PREFACE


In February 1967 RAND was commissioned by the Office of the Assistant
Secretary of Defense (Systems Analysis) to prepare a text on the
general subject of cost estimating procedures. This memorandum, dealing
with fundamentals of cost analysis, constitutes the introductory portion
of such a text. In addition to the material presented here, the complete
report will present and illustrate methods and techniques for estimating
aircraft and missile costs and will include a chapter on operating costs
and a discussion of cost models. While the emphasis is to be on aircraft
and missiles, the techniques illustrated are applicable to all types of
major equipment, and it is hoped that the text will be useful throughout
the Department of Defense.


SUMMARY 


This memorandum discusses the fundamental problems of estimating
major equipment costs and suggests that for many purposes, particularly
for government cost analysts, a statistical approach is the most
suitable. The kind of data required and the adjustments needed to make
the data useful are discussed in some detail. The use of regression
analysis in deriving cost-estimating relationships is described, but it
is emphasized that unquestioning use of estimating relationships
obtained in this manner can result in serious errors. The concepts
underlying the cost-quantity relationship generally known as the
learning curve are presented along with instructions for its use.
Finally, the problem of uncertainty in cost estimating is discussed and
a few suggestions for dealing with the problem are included.
I. COST ESTIMATING METHODS


A cost estimate is a judgment or opinion regarding the cost of an
object, commodity, or service. This judgment or opinion may be arrived
at formally or informally by a variety of methods, all of which are
based on the assumption that experience is a reliable guide to the
future. In some cases the guidance is clear and unequivocal, e.g.:
bananas cost $.15/lb last week; one estimates they will cost about
$.15/lb next week, barring unforeseen circumstances such as a freeze
in Guatemala. At a slightly more sophisticated level, average costs
are calculated and used as factors to estimate the cost to excavate
a cubic yard of earth, to fly an airplane for an hour, to drive an
automobile a mile, etc. Much, perhaps most, estimating is of this
general type, that is, where the relationship between past experience
and future application is fairly direct and obvious.
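
The factor approach just described can be sketched in a few lines. The
figures below are invented for illustration; only the method (average
the past cost per unit of activity, then apply it to the planned amount)
comes from the text, and the function names are hypothetical.

```python
# Hypothetical past experience: (flying hours, total operating cost in dollars).
past_periods = [(1200, 540_000), (1500, 690_000), (900, 396_000)]


def cost_factor(history):
    """Average cost per unit of activity over all recorded periods."""
    total_units = sum(units for units, _ in history)
    total_cost = sum(cost for _, cost in history)
    return total_cost / total_units


def estimate(history, planned_units):
    """Apply the historical factor to a planned level of activity."""
    return cost_factor(history) * planned_units


factor = cost_factor(past_periods)         # dollars per flying hour
next_year = estimate(past_periods, 1000)   # cost of 1,000 planned hours
```

A factor of this kind is only as good as the assumption that next year
resembles the periods in the sample.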

The more interesting problems, however, are those where this
relationship is unclear because the proposed item is different in some
significant way from its predecessors. The challenge to cost analysts
concerned with military hardware is to project from the known to the
unknown, to use experience on existing equipment to predict the cost
of next-generation missiles, aircraft and space vehicles. The challenge
is not only in new equipment designs, since new materials, new
production processes, and new contracting procedures also add to the
uncertainty. Such innovations are frequently accompanied by an
anticipation of cost reduction, and these expectations have to be
carefully evaluated.

The techniques used for estimating hardware costs range from
intuition at one extreme to a detailed application of labor and material
cost standards at the other. The Air Force Cost Estimating Manual
(AFSC Manual 173-1) lists five basic estimating methods--industrial
engineering standards; rates, factors and catalog prices; estimating
relationships; specific analogies; and expert opinion. Other sources
put the number at two (synthesis and analysis), three (round-table
estimating, estimating by comparison, and detailed estimating) or four
(analytical appraisal, comparative analysis, statistical, and standards).
In this chapter we shall not attempt to be comprehensive but will limit
our discussion to three techniques--the industrial engineering approach,
analogy, and the statistical approach--and it is the last of these with
which we will be primarily concerned throughout the remainder of the book.

Estimating by industrial engineering procedures can be broadly
defined as an examination of separate segments of work at a low level
of detail and a synthesis of the many detailed estimates into a total.
In the statistical approach, estimating relationships using explanatory
variables such as weight, speed, power, frequency and thrust are relied
upon to predict cost at a higher level of aggregation.* Figure I-1
illustrates this difference in level of detail. At the lowest level of
detail the estimator begins with a set of drawings and specifies each
engineering or production operation that will be required, the work
stations where each operation will be performed, and the labor and
material required. This is sometimes referred to as "grass-roots" or
"bottom-up" estimating.

Figure I-2 illustrates the detail required at the lowest level of
estimating, in this case for forming a center bracket of steel plate.
The name and number of the operations and the machines that will be
used are given along with estimates of the setup time and operating
labor cost. Standard setup and operating costs are used in making the
estimates wherever these exist, but if standards have not been
established, as is frequently the case in the aerospace industry, a
detailed study is made to determine the most efficient method of
performing each operation. A standard may be a "pure" standard or an
"attainable" standard, but essentially it is the minimum time required
to complete a given operation under some specified condition and,
theoretically, should be approached asymptotically when the planned
production rate is attained.

*Statistical estimating is sometimes defined as a statistical
extrapolation to produce an estimate-at-completion after some progress
has been made on a job and costs or commitments experienced. This is
not the sense in which the term is used here.


Fig. I-1--Levels of aggregation for estimating purposes

Fig. I-2--Detailed labor cost estimate

Standards are not widely used in the aerospace industry for
estimating costs.* They are best applied where a long, stable production
run of identical items is envisaged, whereas the emphasis in this
industry is on development rather than production. The Gemini program
provides an extreme example of this--12 spacecraft of varying
configurations were developed and produced at a cost of about $700
million. Other examples would be less dramatic, but it is generally true
that, compared to other industries, production runs of advanced military
and space hardware tend to be short and that both design configurations
and production processes may continue to evolve even after several
hundred units have been completed. This means that standards are
continually changing--one standard applies at unit 50, another at other
production quantities. Because the changes are unpredictable, it is
difficult to establish standards in advance of production experience
that will be applicable at some specified production quantity.

*They are used extensively for other purposes, however, such as
control of shop performance.

Industrial engineering estimating procedures require considerably
more personnel and data than are likely to be available to government
agencies under any foreseeable conditions. One of the largest aerospace
firms figures that to estimate the cost of an airframe using this
approach about 4500 estimates are required, and for this reason it
avoids making industrial engineering estimates whenever possible. They
take too much time and are costly during a period of limited funds for
both contractor and government. Moreover, for many purposes they have
been found to be less accurate than estimates made statistically.

One reason for this is simply that the whole generally turns out to be
greater than the sum of the 4500 parts. The detail estimator works under
the same disadvantages as do all other estimators before an item has
been produced. Working from sketches, blueprints, or word descriptions
of some item that has not been completely designed, he can assign costs
only to work that he knows about. (An attempt is sometimes made to
estimate how complete the work statement is, and this estimate becomes a
factor to apply to the detail estimate, e.g., the work statement is
estimated to be 50 percent complete, so the detail estimate is multiplied
by two.) The effect of a low estimate here is compounded because detail
estimating is normally attempted only on a portion of production labor
hours. A number of production labor elements, such as rework, planning
time, coordination effort, etc., are usually factored in as percentages
of the detail estimate. Then, other cost elements, such as sustaining
effort, tool maintenance, quality control and manufacturing research,
are factored in as percentages of production labor. Thus, small errors
in the detail estimate can result in large errors in the total.
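
The compounding described above can be illustrated with a short
calculation. This is a sketch only: the function name and every
percentage below are assumptions chosen for illustration, not values
from the text, apart from the 50 percent completeness figure used in the
example above.

```python
def total_labor(detail_hours, completeness, labor_adders, other_adders):
    """Build a total from a detail estimate using percentage factors.

    completeness : fraction of the work statement believed to be defined
    labor_adders : percentages of the detail estimate (rework, planning, ...)
    other_adders : percentages of production labor (tool maintenance, QC, ...)
    """
    scaled = detail_hours / completeness            # 50% complete -> multiply by 2
    production = scaled * (1 + sum(labor_adders))   # add factored labor elements
    return production * (1 + sum(other_adders))     # add other factored elements


# All factor values below are invented for illustration.
factors = dict(
    completeness=0.5,
    labor_adders=[0.10, 0.05, 0.05],
    other_adders=[0.15, 0.10, 0.05],
)
base = total_labor(10_000, **factors)
low = total_labor(9_000, **factors)   # detail estimate 1,000 hours too low
shortfall = base - low                # hours missing from the total
```

With these assumed factors the overall multiplier is 2 x 1.2 x 1.3 =
3.12, so a 1,000-hour omission in the detail estimate becomes a
shortfall of about 3,120 hours in the total. The percentage error is
unchanged by the factoring; it is the absolute error that grows, and any
error in the completeness factor itself is carried through in the same
way.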

A second reason has already been suggested. This is the view that
significant variability in the fabrication and assembly of successive
production units is and will continue to be characteristic of the
industry. Production runs of like models tend to be of limited length
and to be characterized by numerous design changes. In the case of
military aircraft, production rates have tended to vary frequently and
at times unexpectedly. The proportion of new components in equipment
is probably higher in the airframe industry than in any other. The
effect of these factors can be represented statistically by the learning
or progress curve* so characteristic of this industry. One set of
fabrication and assembly modes is succeeded by more efficient production
functions, thus lowering the total labor requirement. The introduction
of engineering changes causes discontinuities in this process but does
not interfere with the general trend. If new manufacturing processes
and techniques are introduced, these may cause changes in past
relationships. History, however, seems to show that changes in
manufacturing and management techniques, while they may have dramatic
impacts in circumscribed areas, tend to result in only gradual changes
over the entire process.
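
As a rough illustration of the learning or progress curve mentioned
above, the sketch below uses the conventional log-linear unit curve, in
which the hours for a unit fall to a fixed fraction each time the
cumulative quantity doubles. That functional form, the function names,
and the numbers are assumptions made here for illustration; this section
does not give the formulation.

```python
import math


def unit_hours(first_unit_hours, slope, n):
    """Hours for unit n on a log-linear unit curve.

    slope is the fraction to which unit hours fall each time the
    cumulative quantity doubles (0.80 for an "80 percent" curve).
    """
    b = math.log(slope, 2)   # exponent; negative when slope < 1
    return first_unit_hours * n ** b


def cumulative_hours(first_unit_hours, slope, quantity):
    """Total hours for units 1 through quantity."""
    return sum(unit_hours(first_unit_hours, slope, n)
               for n in range(1, quantity + 1))


# Illustrative values only.
u1 = unit_hours(100_000, 0.80, 1)
u2 = unit_hours(100_000, 0.80, 2)
u4 = unit_hours(100_000, 0.80, 4)
first_two = cumulative_hours(100_000, 0.80, 2)
```

On an 80 percent curve with a 100,000-hour first unit, unit 2 requires
about 80,000 hours and unit 4 about 64,000. An engineering change would
appear as a break in such a curve rather than a change in its general
trend.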

Because a private concern generally has data only on its own
products, much of the estimating in industry is based on analogy,
particularly when a firm is venturing into a new area. In the 1950s,
for example, aircraft companies bidding on ballistic missile programs
drew analogies between aircraft and missiles to develop estimates for
the latter. Douglas Aircraft Company (now McDonnell-Douglas) made a
good estimate on the Thor intermediate range ballistic missile by
comparing Thor with the DC-4 transport airplane. The same company later,
and less successfully, based its estimates of the Saturn S-IV stage
on its Thor experience, adjusting for differences in size, the number
of engines, higher performance, and insulation problems (the need to
cope with liquid hydrogen as well as liquid oxygen).

*The learning curve is discussed in Chapter VI.

At all levels of aggregation much estimating is of this type--
System A required 100,000 hours; given the likenesses and differences
in design and performance of proposed System B, the requirement for B
is estimated to be, say, 120,000 hours. Or, at a different level,
engineers and shop foremen may rely on analogies when making a
grass-roots estimate, and in this event analogy becomes part of the
industrial engineering approach. The major drawback to estimating by
analogy is that it is essentially an intuitive process, and as a
consequence requires considerable experience and judgment to be done
successfully.

Thus, while statistical procedures are preferable in most situations,
there are circumstances where analogy or industrial engineering techniques
are required because the data do not provide a systematic historical basis
for estimating cost behavior. It may be that a new item is to be
constructed of some unfamiliar material, or that some design
consideration is so radically different that statistical procedures are
inadequate. The employment of new structural material for aircraft often
requires the development of special cutting and forming techniques with
significantly different manufacturing labor requirements than those
projected from a sample of essentially aluminum airframes. Faced with
this problem on titanium, airframe companies developed standard-hour
values for titanium fabrication on the basis of shop experience
fabricating test parts and sections. Ratios of these values to those for
comparable operations on aluminum aircraft were prepared, and these
ratios were used in existing statistical estimating relationships. Thus,
while industrial engineering procedures were used to provide input data,
the approach remained statistical.
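
One way such a ratio might enter a statistical estimate is sketched
below. The text does not say how the ratios were applied; weighting by
the titanium share of the structure is an assumption made here, and the
function name and all numbers are hypothetical.

```python
def adjusted_hours(aluminum_based_hours, titanium_fraction, ti_to_al_ratio):
    """Scale an estimate derived from aluminum airframes for titanium content.

    aluminum_based_hours : hours predicted by the existing relationship
    titanium_fraction    : assumed share of the structure built of titanium
    ti_to_al_ratio       : standard hours for titanium / standard hours
                           for the comparable aluminum operation
    """
    aluminum_part = aluminum_based_hours * (1 - titanium_fraction)
    titanium_part = aluminum_based_hours * titanium_fraction * ti_to_al_ratio
    return aluminum_part + titanium_part


# Hypothetical inputs: one-quarter titanium, titanium work taking twice as long.
hours = adjusted_hours(1_000_000, 0.25, 2.0)
```

With these assumed inputs the estimate rises from 1,000,000 to 1,250,000
hours. The industrial engineering work supplies only the ratio; the base
estimate still comes from the statistical relationship.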

Another exception occurs in the case of industrial facilities.
Requirements for these cannot be estimated without knowing the
contractor's identity and the extent and availability of his existing
plant. Consequently, facilities cost must be estimated from information
available for each specific case.

There will always be exceptions of this kind, but in general the
statistical approach is useful in a wide range of contexts, e.g.,
whether the purpose is long-range planning or contract negotiation.
In the former a more highly aggregated procedure may be used, because
it ensures comparability when little detailed knowledge about the
equipment is available. Total hardware cost may be estimated as a
function of one or more explanatory variables, e.g., engine cost as a
function of thrust or transmitter cost as a function of power output and
frequency, but this is often a matter of necessity, not choice. Even for
long-range planning, it is sometimes desirable to estimate in some detail.
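
A highly aggregated relationship of the kind just described, engine cost
as a function of thrust, might be derived as sketched below. The
power-law form, the function name, and the sample are all assumptions
for illustration; the costs are generated from a known curve so that the
fit can be checked, and they are not historical data.

```python
import math

# Made-up sample: engine thrust (lb) and cost (arbitrary units).
# Costs are generated from cost = 2.0 * thrust**0.7, so the fit is exact.
thrusts = [5_000.0, 10_000.0, 15_000.0, 20_000.0]
costs = [2.0 * t ** 0.7 for t in thrusts]


def fit_power_law(xs, ys):
    """Least-squares fit of log(y) = log(a) + b*log(x); returns (a, b)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    sxx = sum((u - mean_x) ** 2 for u in lx)
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


a, b = fit_power_law(thrusts, costs)
predicted = a * 12_000.0 ** b   # estimate for a proposed 12,000-lb-thrust engine
```

Real data will scatter around any fitted curve, and a relationship of
this kind should not be applied far outside the range of the sample from
which it was derived.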

To say that statistical techniques can be used in a variety of
situations does not imply that the techniques are the same for all
situations. They will vary according to the purpose of the study and
the information available. In a conceptual study it is necessary to
have a procedure for estimating the total expected costs of a program,
and this must include an allowance for the contingencies and unforeseen
changes that seem to be an inherent part of most development and
production programs.

Similarly, a long-range planning study would use industry-wide
labor and burden rates and an estimated learning curve slope, while
later in the acquisition cycle data that are specific to a particular
contractor in a particular location can be used. In effect this merely
states the obvious--that as more is known, fewer assumptions are
required. When enough is known, and this means when a product is well
into production, accounting-type information and data can be taken
directly from records of account and used with a minimum of statistical
manipulation. This technique is useful only in those cases where the
future product or activity under consideration is essentially the same
(both in terms of configuration and scale of production or operation)
as that for the past or current period.

In any situation the estimating procedure to be used should be
determined by (1) the data available, (2) the purpose of the estimate,
and (3) to a lesser extent by less relevant factors such as the time
available to make an estimate. The essential idea we wish to convey
in this chapter is that, when properly applied, statistical procedures
are varied and flexible enough to be useful in most situations defense
equipment cost analysts are likely to face. While no specified set of
procedures can guarantee accuracy, decisions must be made and it is
essential that they be made on the best possible information. What we
are seeking here are the approaches which will give the best possible
answers, given the basic information that is available.
II. DATA COLLECTION AND ADJUSTMENT

The government has been collecting cost and program data on weapon
and support systems for many years, sometimes in detail, sometimes in
highly aggregated form, but always in quantity. As a consequence, it
is a little surprising that when an estimating job comes along, the
right data seldom seem to be at hand. One can speculate about why this
should be, but in our opinion the essential reason is that the needs of
cost analysis have not always been considered in designing the many
information systems that have been used over the years by the Army, Navy
and Air Force. Data have been collected primarily for program control,
for program management and for program audit, but this type of information
was never systematically processed and stored. Instead, after a couple
of years it has generally been discarded or stored in warehouses that
are not readily accessible. Moreover, the data were inconsistent since
they were gathered according to the requirements of each Service and
each program manager. As a consequence, to obtain the kind of data
necessary to develop estimating techniques, the analyst has had to go
back to the contractor's records.

With the institution of CIR (Cost Information Report) in 1966, the
situation should greatly change. This report was designed to collect
costs and related data on aircraft, missile and space systems and their
related components for the purpose of assisting both industry and
government in estimating and analyzing the costs of these items.
Information from other sources--contract records, CFE records, and the
like--can be processed and spliced to CIR as it becomes available.
Hence, over a period of years, as data are accumulated, the need for ad
hoc collection efforts should diminish. These efforts will never
disappear completely, however. It will never be possible to rely on CIR
alone (or on any foreseeable information system), because it will not
apply to all new hardware and will not provide all the cost information
that might ever be required on the hardware it does cover. The subject
of data collection is therefore still one with which cost analysts must
be concerned.
In the best of all possible worlds the analyst would have such a
wealth of data that he could develop estimating techniques responsive
to any demand. Such a world is unknown in the aerospace industry, where
even the largest contractors are reluctant to allocate the resources
required to put estimators in such a favorable position. A government
estimator is better placed in some regards, i.e., he has a much broader
base of experience to draw upon, but he lacks the detail an industry
estimator has on his own company's products. Data collection is
expensive; hence, the estimator is generally in the position of having
less than he wants and of having to design techniques to fit the data
he has been able to accumulate.

Some minimum data requirement exists for any given job, however,
and before data collection begins the analyst must consider the scope
of his problem, define generally what he wants to do, and decide how
he is going to do it. The data required to estimate equipment costs for
a long-range planning study can be substantially less than those needed
to prepare an independent cost estimate for contract negotiation. In
the former, total equipment costs may suffice, while in the latter costs
must be collected at the level of detail in which the contract is to
be negotiated. For major items this means a functional breakout, e.g.,
direct labor, materials, engineering, tooling, etc. One could postulate
problems requiring an even greater amount of detail; suppose, for
example, that two similar hardware items had substantially different
costs. Only by examining the cost detail could this be explained.

In performing this initial appraisal of the job the analyst will
be greatly aided by a thorough knowledge of the kind of equipment with
which he will be dealing--its characteristics, the state of its
technology, and the available sample. With this knowledge he can
determine what types of data are required and available for what he
wants to do, where the data are located, and what types of adjustments
may be required to make the collected data base consistent and
comparable. Only after the problem has been given this general
consideration should one begin the task of data collection.

This is an important point. All too often a mountain of data is
collected with little thought as to how it is going to be used. The
result is that some portion may be unnecessary, unusable, or not
completely understood. Data collection is generally the most troublesome
and time-consuming part of any cost analysis. Consequently, careful
planning in this phase of the overall effort is well worthwhile.

To develop a cost-estimating procedure, at least three different
types of historical data are required. First, there are the resource
data, usually in the form of expenditures or labor hours. It is
customary to apply the word cost to both, and that practice is followed
throughout this chapter. A second type of data describes the possible
cost-explanatory elements; for hardware such as aircraft and missiles
this means performance and physical characteristics. The third type
is program data, i.e., information related to the development and
production history of the hardware item.

Resource  Data 

Resource  data  are  generally  classified  into  end-item  categories 
or  functional  categories.  An  example  of  the  former  in  some  of  the 
various  possible  levels  of  detail  would  be: 

System
Subsystem
Component
Part
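
The end-item levels listed above form a hierarchy in which costs
recorded at any level can be summed upward. A minimal sketch, assuming a
nested-dictionary layout and invented names and costs (none of which
come from the text):

```python
def rollup(node):
    """Total cost of a node: its own recorded cost plus that of all children."""
    own = node.get("cost", 0)
    return own + sum(rollup(child) for child in node.get("children", []))


# Hypothetical end-item structure; costs are in arbitrary units.
system = {
    "name": "System",
    "children": [
        {"name": "Subsystem A", "children": [
            {"name": "Component A1", "children": [
                {"name": "Part A1-1", "cost": 120},
                {"name": "Part A1-2", "cost": 80},
            ]},
            {"name": "Component A2", "cost": 300},
        ]},
        {"name": "Subsystem B", "cost": 500},
    ],
}

total = rollup(system)                               # whole system
subsystem_a = rollup(system["children"][0])          # one subsystem only
```

Data may be available only at some levels for some items, which is why
the sketch lets a cost be recorded at any node rather than only at the
part level.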

The functional categories are engineering, tooling, manufacturing,
quality control, purchased equipment, etc., and typically these are
further broken down into labor, material, overhead, and other direct
charges. The fountainhead of resource data is the contractor's plant.
While the accounting systems will vary from one company to another, in
general the amount of detail is immense. A typical airframe company,
for example, sets up the production process on the basis of a number
of different jobs or stations, each identified by a number or symbol.
All manufacturing direct labor and/or material (depending on the type
of cost accounting system) expended on a given job is recorded on a
job order or, as is becoming increasingly common, fed directly
into a computer. Where such a system is used, the actual hours incurred
for every operation are available to management, and these costs can be
aggregated as needed. They cannot generally be attributed to a single
unit, however, and some elements of cost, e.g., tooling and engineering,
are not even identifiable by lot. And since different contractors do
the work differently, they will have different job orders. This means
in practice that data at more detailed levels may not be comparable
from one contractor to another. Also, detailed information of this
kind is unnecessary for most government estimating and, as a consequence,
is rarely sought.

Parenthetically, it can be said that if there were a need to
estimate in more detail, the data required would increase by an order of
magnitude or more, and data processing equipment would become a virtual
necessity. The question of when to incorporate automatic data
processing techniques into the data collection effort hinges primarily
on the volume of data to be handled. The trend in the aerospace industry
is to rely more and more on computers for internal data needs, and for
some purposes data have been provided to the government on punch cards
or magnetic tape. Thus, there are no technical reasons why cost data
could not be obtained in this form should it be more convenient to the
cost analyst, but as mentioned earlier, there are good reasons not to
use excessive detail even if it is readily available--expense increases
and accuracy is likely to decrease.

Theoretical considerations apart, the hard truth is that estimating
techniques must be based on the resource data the analyst can lay
his hands on, and in the past the availability of data has varied
greatly from one type of equipment to another. As an illustration of
this, aircraft estimating procedures tend to be different from those
developed for missiles and spacecraft. An airframe model may contain
the following cost elements:

Initial  and  sustaining  engineering 
Development  support 
Flight  test  operations 
Initial  and  sustaining  tooling 
Manufacturing  labor 
Manufacturing  material 
Quality  control 


A list of cost elements something like this is desirable for all
hardware estimating, but because of data limitations, present procedures
for engines often include only two cost categories--development and
production--and avionics procedures only one--procurement cost to the
government. CIR should expand the possibilities in the future.

Performance  and  Physical  Characteristics 

Information about the physical and performance characteristics of
aircraft, missile and space systems is just as important as resource
data. This means that data collection in this area can be
time-consuming, particularly since it is seldom clear in advance what
the necessary data will be. The goal, of course, is to obtain a list of
those characteristics which best explain differences in cost. Weight is
the most commonly used explanatory variable, but weight alone is seldom
enough. For airframes, speed is almost always included as a second
explanatory variable, and one estimating procedure for aircraft* uses
all of the following:

Maximum speed at optimal altitude
Maximum speed at sea level
Year of first delivery
Total airframe weight
Increase in airframe weight from unit 1 to unit n
Weight of installed equipment
Engine weight
Electronics complexity factor

In addition, the following characteristics were considered, but not
used:

Maximum  rate  of  climb 
Maximum  wing  loading 
Empty  weight 
Maximum  altitude 
Design  load  factor 
Maximum  range 
Maximum  payload 

*Methods  of  Estimating  Fixed-Wing  Airframe  Costs.  Vol.  I,  Planning 
Research  Corporation,  PRC  R-547,  1  February  1965. 




At the outset of a study to develop an estimating relationship for
aircraft costs, the cost analyst would not know which of all these
characteristics would provide the best explanation of variations among
the cost of different aircraft and would try to be as comprehensive as
possible. An analyst who is familiar with the type of hardware under
study should have some idea of what the most likely candidates are, but
he will generally consider more characteristics than will eventually be
used.


Program  Data 

A third type of essential data is drawn from the development and
production history of hardware items. The acceptance date of the item,
the significant milestones in the development program, the production
rates, and the occurrence of major and minor modifications in its
production--information such as this can contribute to the development
of meaningful cost-estimating relationships. It will be noted that
the list of explanatory variables in the previous section includes
year of first delivery and increase in airframe weight from unit 1 to
unit n, information that would be included in the category program
data.

An  airframe  typically  changes  in  weight  during  both  development 
and  production  as  a  result  of  engineering  changes.  For  example,  the 
weight  of  the  F-4D  varied  as  follows: 


Cumulative          Airframe
Plane Number      Unit Wt (lb)

    1- 11             8456
   12-186             8941
  187-241             8541
  242-419             9193


Since labor hours are commonly associated with weight to obtain
hours-per-pound factors, it is important to have the weights correct
and not to use a single weight.

The  need  for  other  kinds  of  program  data  will  be  made  clear  by 
the following pages on data adjustment. To cite one example here, one
needs to know the year in which expenditures occur to adjust cost data
for  price  level  changes.  (This  is  the  reason  for  at  least  one  CIR  sub¬ 
mission  annually.)  A  certain  amount  of  what  we  have  chosen  to  call 
program  data  cannot  be  specified  this  definitely  nor  can  its  use  be 
foretold,  but  it  is  important  nonetheless.  This  is  what  might  be 
termed  background  information--information  about  what  else  is  going 
on  in  the  contractor's  plant  at  the  time  a  particular  hardware  item  is 
being  built,  unusual  problems  the  contractor  may  be  encountering,  at¬ 
tempts  to  compress  or  stretch  out  the  program,  inefficiencies  noted, 
etc.  These  facts  may  be  useful  in  explaining  what  appear  to  be  aber¬ 
rations  when  the  resource  data  are  compared  with  those  from  other 
development  and  production  programs.  In  addition  a  history  of  a  con¬ 
tractor's  overhead,  G&A,  and  labor  rates  is  useful  both  for  analyzing 
and  predicting  costs. 

DATA  ADJUSTMENT 

To  be  usable  to  the  cost  analyst  data  must  be  consistent  and 
comparable,  and  in  most  cases  the  data  as  collected  are  neither.  Hence, 
before  estimating  procedures  can  be  derived  the  data  have  to  be  adjusted 
for  such  things  as  price  level  changes,  definitional  differences,  pro¬ 
duction  quantity  differences,  and  so  on.  This  section  discusses  some 
of  the  more  common  adjustments.  It  is  by  no  means  an  exhaustive  treat¬ 
ment  of  the  subject,  since  the  list  of  possible  adjustments  is  long  and 
many  of  them  will  apply  only  in  a  very  small  number  of  cases.  Also, 
evidence on certain types of adjustments--for contractor efficiency,
for contract type, for program stretch-out, etc.--consists largely of
opinion rather than hard data, and while we can allude to such
adjustments, the research necessary to treat them in some definitive
way has not yet been done.

Definitional  Differences 

Different  contractor  accounting  practices  are  one  of  the  primary 
reasons  that  adjustment  of  the  basic  cost  data  is  generally  necessary. 
Companies  record  their  costs  in  different  ways,  are  often  required  to 
report costs to the government by categories somewhat different from
those used internally, and the reporting categories change from time
to time. Because of these definitional differences, one of the first
steps in any cost analysis is to state the definition that is being
used and then adjust all data to this one definition. With the
inception of CIR, a standard set of definitions for airframes has been
established for use throughout the Department of Defense. A primary
purpose of CIR is to overcome the problem of definitional differences
in hardware cost data. For the next few years, however, when most
data will antedate CIR, some adjustment will be required.

As an example of what may be expected, a cost analyst may be
examining  data  from  a  sample  of  10  hardware  items  and  discover  that 
the  cost  element  Quality  Control  is  missing  for  some  of  the  earlier 
items.  He  may  conclude  that  no  quality  control  was  exercised  back  in 
the  1950's  or  that  this  function  is  included  in  some  other  cost  element. 
The  latter  is  correct  of  course.  Traditionally,  Quality  Control  was 
carried  in  the  burden  account,  and  it  was  only  in  the  late  1950's  that 
it  began  to  appear  (at  the  request  of  the  Department  of  Defense)  as  a 
separate  element.  Hence  to  use  cost  data  on  equipment  built  prior  to 
this  change  some  portion  of  overhead  cost  has  to  be  converted  to  Quality 
Control. 

A more current example involves Planning, which in the CIR definition
is included in Tooling. Planning consists of two components--tool
planning and production planning--so some companies put the first
in  Tooling  and  the  second  in  Manufacturing.  Other  practices  are  to 
include  tool  planning  in  Engineering,  to  put  all  planning  in  Manufac¬ 
turing,  or  to  include  some  portion  in  Overhead.  In  our  view  the  CIR 
definition  is  the  most  logical. 

Table II-1 illustrates this problem more concretely. On the left
is a slightly abbreviated version of the CIR list of cost elements;
on the right are the categories used by a large aerospace company and
the nonrecurring costs of a proposed airframe. The lists are different
and, as shown by Table II-2, a simple rearrangement of the contractor
cost elements does not solve the adjustment problem.

After this rearrangement four of the contractor cost elements--
Developmental Material ($2.6 million), Outside Production ($70,000),
Other Direct Charges ($2.7 million), and Manufacturing Overhead


Table II-2

CONTRACTOR COST ELEMENTS ARRANGED IN CIR FORMAT

                                                            Cost (Thousands of $)
                                                                          Outside
CIR Cost Element                Contractor Cost Element      In House  Production

1. Engineering
     Direct labor               Engineering
     Overhead                   Engineering overhead
     Material
     Other direct charges

2. Tooling
     Direct labor               Tooling direct labor           11,600
     Overhead
     Materials and purchased
       tools                    Tooling material                2,600
     Other direct charges

3. Quality control
     Direct labor               Inspection                        620
     Overhead
     Other direct charges

4. Manufacturing
     Direct labor               Developmental direct labor      2,500
                                Production direct labor           850
     Overhead
     Materials and purchased
       parts                    Production material               500
     Other direct charges

5. Purchased equipment          Purchased equipment                 5

6. Material overhead            ----

($28.94 million)--remain to be dealt with. Since these four categories
can amount to well over half the total cost of a large production
contract, we are not talking about trivial adjustments. Developmental
Material presumably would be split between Engineering Material and
Manufacturing Material; Other Direct Charges would have to be allocated
among Engineering, Tooling, Quality Control and Manufacturing; and part
of Manufacturing Overhead would be apportioned to Tooling Overhead and
Quality Control Overhead. In each of these instances the contractor
furnishing Cost Information Reports would be able to make the necessary
adjustments from his own accounting records. Outside Production costs,
although small in this example, in some cases may comprise 30 to 40
percent of the total cost of an airframe. Where this is the case, the
labor hours and material costs incurred by the prime contractor fall
far short of the total required to build an airplane, and some method
of arriving at a total must be devised. Ordinarily, the contractor
would have a detailed breakout of costs only for subcontractors on
cost-reimbursable contracts, and other Outside Production costs would
have to be allocated to the specified categories. Production labor
hours incurred out-of-plant, for example, are often estimated on the
basis of the weight of that portion of the airframe being built out of
plant. In using historical data, the analyst may be in a similar
position occasionally, and where the amounts involved are large, he
should be guided by whatever information the contractor can provide.


Physical  and  Performance  Characteristics 

A problem similar to the one discussed above concerns the need
for consistency in definitions of physical and performance
characteristics. "Speed," for example, can be defined in many
ways--maximum speed at optimal altitude, true speed, equivalent speed,
indicated speed, etc.--which differ in exact meaning and value. The
weight of an aircraft or missile depends on what is included. Gross
weight, empty weight and airframe unit weight are all used for
aircraft. Some agencies include sweep volume in their definition of the
physical volume of an aircraft fire control system; others exclude it.
Examples of this kind are numerous, but the point hardly needs
elaboration. It is raised here because differences such as these can
lead an analyst unfamiliar with the equipment being investigated to use
inconsistent or varying values inadvertently. When data are being
collected from a variety of sources, an understanding of the terms used
to describe physical and performance characteristics is at least as
important as an understanding of the content of the various cost
elements.


Nonrecurring and Recurring Costs

Another problem hinging on the question of definitions concerns
nonrecurring and recurring costs. Recurring costs are a function of
the number of items produced; nonrecurring costs are not. Thus, for
estimating purposes it is useful to distinguish between the two, and
CIR provides for this distinction. Unfortunately, historical cost data
frequently show such cost elements as initial and sustaining
engineering as an accumulated item in the initial contract. Various
analytical techniques have been developed for dividing the total into
its two components synthetically, but it is not clear at this time
whether the nonrecurring costs obtained by ex post facto methods will
be comparable to those reported in CIR. The CIR instructions state:

It is preferable to identify the point of segregation between
nonrecurring and recurring engineering costs as a specific
event or point in time. Ideally, the event used would be the
point at which "design freeze" takes place as a result of a
formal test or inspection, and after which formal Engineering
Change Proposal (ECP) procedures must be followed to change
design. If no reasonable event can be specified for this
purpose, then all engineering costs incurred up to the date
of 90 percent engineering drawing release may be used.

While it would be premature to consider the kinds of adjustments
needed before a body of CIR data exists, splicing historical data to
CIR data may involve an adjustment of some kind.

A more subtle problem arises when nonrecurring costs on one product
are combined with recurring costs on another, i.e., when the
contractor is allowed to fund development work on new products by
charging it off as an operating expense against current production.
This practice is especially prevalent in the aircraft engine industry.
Separation of the nonrecurring and recurring costs in this instance
means an adjustment of the production costs shown in contract or audit
documents to exclude any amortization of development. The nonrecurring
expense which had been amortized can then be attributed to the item for
which it was incurred. This adjustment can only be accomplished in
cooperation with the accounting departments of the companies involved.
It would be unnecessary, of course, for equipment on which CIR data
are available.


Figure II-1 shows the change in average hourly earnings of produc-


declined slightly during the early 1920's and again during the
Depression, the trend has been steadily upward since 1934. The hourly
wage rate has increased by a factor of 4.75 over a 45-year period, or
put another way, in 1965 a manufacturer paid $4.75 for labor that would
have cost him $1.00 back in 1920. The implication of this for equipment
costs is clear. If the labor component of an automobile cost $500 in
1920, the cost for the same car today would be something over $2000
(the hours required in 1965 would be less because of increased
productivity, but this effect will be discussed later).

The relevance of these observations to the subject of data adjustment
is that the manufacturing dates of the different hardware items in a
sample are normally spread over a period perhaps as long as 10 to 15
years. To compare a missile built in 1955 when labor cost about $2.35
per hour with a missile built 10 years later when the labor rate had
increased to over $3.35 per hour, the labor cost of both must be
adjusted to a common base. (This problem is obviated by dealing in
hours rather than dollars, but an adjustment would still be needed for
raw material and purchased parts.) Adjustments of this kind are made by
means of a price index constructed from a time-series of data by
selecting one year as the base and expressing the value for that year
as 100. The other years are then expressed as percentages of this base.
The hourly earnings from 1950 to 1960 for production workers could be
converted to an index using any of the years as the base; in the
example below 1950 and 1960 have both been used as base years.


          Average       Index with     Index with
          Hourly         1950 as        1960 as
Year      Earnings      Base Year      Base Year

1950       $1.44           100             64
1951        1.56           108             69
1952        1.65           115             73
1953        1.74           121             77
1954        1.78           124             79
1955        1.86           129             82
1956        1.95           135             86
1957        2.05           142             91
1958        2.11           147             93
1959        2.19           152             97
1960        2.26           157            100
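
The construction of the two index columns can be sketched in a few lines
of Python. This is only an illustration of the arithmetic described in
the text; the function name price_index and the rounding to whole index
points are assumptions made here, not part of the original procedure.

```python
# Average hourly earnings of production workers, 1950-1960 (from the table above).
hourly_earnings = {
    1950: 1.44, 1951: 1.56, 1952: 1.65, 1953: 1.74, 1954: 1.78, 1955: 1.86,
    1956: 1.95, 1957: 2.05, 1958: 2.11, 1959: 2.19, 1960: 2.26,
}


def price_index(series, base_year):
    """Express every year's value as a percentage of the base-year value.

    Rounding to whole index points is an assumption chosen to match the
    way the table is printed.
    """
    base_value = series[base_year]
    return {year: round(100 * value / base_value) for year, value in series.items()}


index_1950 = price_index(hourly_earnings, 1950)
index_1960 = price_index(hourly_earnings, 1960)

for year in sorted(hourly_earnings):
    print(year, hourly_earnings[year], index_1950[year], index_1960[year])
```

Changing the base year rescales every entry by the same ratio, so the
choice of base affects only how the numbers read, not the relative
change between any two years.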

Information to construct a labor index such as this is available
in the Bureau of Labor Statistics publication Employment and Earnings,
and Table II-3 presents indexes based on this source.

Table II-3

LABOR PRICE INDEX

                    Aircraft       Other         Motor      Electrical
                    Engines       Aircraft      Vehicles    Equipment      Ship
                   and Engine    Parts and        and          and       and Boat
Year    Aircraft     Parts       Equipment     Equipment     Supplies    Building

1952      .59         .62          NA(a)          .61          .64         .63
1953      .63         .63          NA(a)          .64          .67         .68
1954      .66         .66          NA(a)          .66          .69         .68
1955      .69         .68          NA(a)          .74          .71         .71
1956      .72         .71          NA(a)          .75          .75         .75
1957      .75         .75          NA(a)          .73          .79         .80
1958      .80         .80           .81           .82          .82         .83
1959      .84         .84           .85           .81          .85         .86
1960      .86         .87           .88           .84          .89         .89
1961      .89         .90           .90           .85          .91         .93
1962      .91         .93           .93           .89          .93         .97
1963      .94         .95           .94           .93          .95         .98
1964      .98         .96           .98           .96          .98        1.00
1965     1.00        1.00          1.00          1.00         1.00        1.00

(a) Not available (for years 1952-1957 it is suggested that the labor
price index for aircraft be used).

Changes in materials costs are available in another BLS publication,
Wholesale Prices and Price Indexes, and these can be used to develop a
materials price index for a given type of equipment by the following
simple procedure. A list of materials representative of those used in
constructing the equipment is chosen from the commodity groups in the
Wholesale Price Index, and these materials are weighted according to
estimates of the amount of each used in fabricating the equipment. A
composite aircraft raw materials index might be based on the following
materials and weights:

Finished steel ................ .02
Stainless steel sheet ......... .04
Titanium sponge ............... .07
Aluminum sheet ................ .29
Aluminum rod .................. .11
Aluminum extrusions ........... .20
Wire and cable ................ .12
Rivets, etc. .................. .15


For  any  given  year  a  price  index  for  each  of  these  is  obtained  and  a 
composite  index  constructed  by  summing  the  individual  index  numbers 
multiplied  by  the  weightings,  e.g.: 


                            1967 Index                Index Number
Commodity                    Number*       Weight     Times Weight

Finished steel                105.8          .02           2.12
Stainless steel sheet         108.0          .04           4.32
Titanium sponge                60.3          .07           4.22
Aluminum sheet                 99.8          .29          28.94
Aluminum rod                  110.4          .11          12.14
Aluminum extrusions            75.6          .20          15.12
Wire and cable                126.0          .12          15.12
Rivets, etc.                  133.2          .15          19.98

Composite index number                                   101.96

*1957-1959 = 100.
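
The weighted sum behind the composite index number can be checked with a
short Python sketch. The data are taken from the table above; the
function name composite_index and the check that the weights sum to 1
are assumptions added here for illustration.

```python
# (commodity, 1967 index number with 1957-1959 = 100, weight), from the table above.
components = [
    ("Finished steel", 105.8, 0.02),
    ("Stainless steel sheet", 108.0, 0.04),
    ("Titanium sponge", 60.3, 0.07),
    ("Aluminum sheet", 99.8, 0.29),
    ("Aluminum rod", 110.4, 0.11),
    ("Aluminum extrusions", 75.6, 0.20),
    ("Wire and cable", 126.0, 0.12),
    ("Rivets, etc.", 133.2, 0.15),
]


def composite_index(rows):
    """Sum the component index numbers multiplied by their weights.

    The guard on the weight total is an added safety check (an assumption
    of this sketch); the text simply presents weights that sum to 1.00.
    """
    total_weight = sum(weight for _, _, weight in rows)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError(f"weights sum to {total_weight}, expected 1.0")
    return sum(index * weight for _, index, weight in rows)


composite = composite_index(components)
print(f"Composite index number: {composite:.2f}")
```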


Weights in an index such as this need to be updated from time to time
to reflect changing technology, and it may be that those shown here
are only applicable to current aircraft. This simple example is
included only to illustrate the principle of deriving a composite index;
the reader who wishes to pursue the matter further will find index
numbers discussed in most textbooks on economic statistics.* Another
type of composite index is used in those instances where labor and
material costs cannot be separated and the price-level adjustment has
to be made to the total cost of an engine, airframe, missile, etc.
Such an index can be derived in the manner illustrated above with the
labor and material elements weighted according to whatever pattern has
been found to exist in the past, e.g., labor, 80 percent; materials,
20 percent.

*See, for example, W. A. Spurr, L. S. Kellogg, and J. H. Smith,
Business and Economic Statistics, rev. ed., Richard D. Irwin, Inc.,
Homewood, Illinois, 1961.

Overhead, which is a mixture of labor, materials, and items such
as rent, utilities, taxes, etc., in most cases is adjusted by the same
percentage as direct labor. To decide in any particular case whether a
different adjustment factor should be used, an examination of each
component of overhead--indirect labor, fringe benefits, etc.--would be
required. This cannot be done by reference to the various indexes
published by BLS and other governmental agencies.

Adjustment of costs for price level changes is not always as
straightforward as the foregoing discussion may imply. One problem is
that price indexes are inherently inexact and their use, while
necessary, can introduce errors into the data. The average hourly
earnings for all aircraft production workers may increase by $.05 in a
given year but at any particular company they will increase more or
less than that amount. Use of the average number to adjust the data for
a given company will bias the data up or down. Also, for many
specialized items of equipment, a good published price index does not
exist. In fact, the usual indexes are oriented toward the civilian
economy and may be misleading, i.e., they may understate the change
experienced in defense and space industries. The United States, along
with many other countries, furnishes the Organization for Economic
Cooperation and Development (OECD) in Paris with an index applicable to
government defense expenditures in general. This index, shown below for
1952-1964, is useful to refer to when detailed index numbers seem
questionable or are nonexistent.


         Index                  Index
Year     Number        Year     Number

1952       84          1959      102
1953       83          1960      104
1954       84          1961      105
1955       88          1962      106
1956       93          1963      108
1957       97          1964      113
1958      100
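
As a rough illustration of how an index of this kind is applied, the
sketch below restates a cost incurred in one year in the dollars of
another by multiplying by the ratio of the two index numbers. The
function name to_constant_dollars and the $10.0 million figure are
hypothetical and appear nowhere in the source data; only the index
values come from the table above.

```python
# General defense-expenditure index from the table above (the 1958 entry is 100).
defense_index = {
    1952: 84, 1953: 83, 1954: 84, 1955: 88, 1956: 93, 1957: 97, 1958: 100,
    1959: 102, 1960: 104, 1961: 105, 1962: 106, 1963: 108, 1964: 113,
}


def to_constant_dollars(cost, cost_year, target_year, index):
    """Restate a cost incurred in cost_year in target_year dollars."""
    return cost * index[target_year] / index[cost_year]


# Hypothetical example: $10.0 million spent in 1955, expressed in 1964 dollars.
adjusted = to_constant_dollars(10.0, 1955, 1964, defense_index)
print(f"{adjusted:.2f} million 1964 dollars")
```

The same function works with any of the labor or materials indexes
discussed earlier, provided the index covers both years.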
Another problem is that of identifying the years in which
expenditures occur when the only data available show total contract
cost. Production and cash flow may have been spread out over a period
of several years, and in principle the costs should be adjusted for each
year separately. Although CIR will provide the information needed to do
this in the future, it may be unavailable today, and some reasonable
approximation of the expenditure pattern must suffice.

One method of doing this is to use a percent-of-cost versus
percent-of-time curve of the type illustrated in Fig. II-2. These curves
are developed from historical data on a number of programs involving
the same kind of hardware--in this case, large ballistic missiles--and
can be used to break total research and development or total production
cost into annual expenditures. For example, to determine the annual
expenditures in a five-year R&D program amounting to a total of
$50 million the following percentages would be obtained from the R&D
curve of Fig. II-2:

Time  Expenditures 

20  6.5 

40  23.0 

60  65.0 

80  92.0 

100  100.0 

These  percentages  are  cumulative,  of  course,  so  the  annual  percentages 
and  the  amount  they  represent  would  be: 


            Expenditures
                    Dollars
Year    Percent    (millions)

  1        6.5        3.25
  2       16.5        8.25
  3       42.0       21.00
  4       27.0       13.50
  5        8.0        4.00
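
The step from cumulative percentages to annual expenditures is a simple
differencing, sketched below in Python. The cumulative values and the
$50 million total are those given in the text; the function name
annual_expenditures is an assumption of this sketch.

```python
# Cumulative percent of cost at 20, 40, 60, 80 and 100 percent of program time,
# as read from the R&D curve in the text.
cumulative_percent = [6.5, 23.0, 65.0, 92.0, 100.0]
total_cost = 50.0  # millions of dollars


def annual_expenditures(cumulative, total):
    """Difference the cumulative percentages and apply them to the total cost."""
    rows = []
    previous = 0.0
    for year, cum in enumerate(cumulative, start=1):
        percent = cum - previous
        rows.append((year, percent, total * percent / 100.0))
        previous = cum
    return rows


schedule = annual_expenditures(cumulative_percent, total_cost)
for year, percent, dollars in schedule:
    print(f"{year}  {percent:5.1f}  {dollars:6.2f}")
```

Once the annual amounts are known, each year's expenditure can be
adjusted for price level separately.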

In the production phase a technique which can be used is to develop
lag factors by examining delivery schedules and production lead times.
Costs are then lagged behind delivery dates by some reasonable factor.

A more fundamental question than any of those raised above is
whether price-level changes should be made at all. The argument is
sometimes made that the upward trend in wage rates has been accompanied
by a parallel trend in the output per employee, or productivity rate.
This implies that there has been little change in the real costs of
aerospace equipment since increases in wages and materials costs have
been offset by a decrease in the number of employees required per dollar
of output. The real dollar output per man is difficult to measure,
however, in an industry where continual change rather than
standardization is the rule. Certainly the growth in productivity is not
uniform for aircraft, missiles, ships, and tanks, and to develop a
productivity index for each would be a difficult and contentious task.
Present practice, therefore, is to apply the price-level adjustment
factors to obtain constant dollars while remaining alert to any obvious
inequities that may be introduced by doing so.

Cost-Quantity  Adjustments 

Chapter  VI  of  this  volume  discusses  the  cost-quantity  relation¬ 
ship,  generally  known  in  the  aerospace  industry  as  the  learning  curve, 
at  some  length.  For  those  persons  unfamiliar  with  this  concept  it 
states  in  brief  that  each  time  the  total  quantity  of  items  produced 
doubles,  the  cost  per  item  is  reduced  to  some  constant  percentage  of 
its  previous  value.  Whether  one  accepts  this  particular  formulation 
or  not,  the  fact  is  that  for  most  production  processes  costs  are  in 
some  way  a  function  of  quantity:  as  the  number  of  items  produced 
increases,  cost  normally  decreases.  Thus,  in  speaking  of  cost  it  is 
essential  that  some  quantity  be  associated  with  that  cost.  An  equip¬ 
ment  item  can  be  said  to  cost  $100,000,  $80,000,  $64,000,  or  $51,200 
and all of these numbers will be correct.

Which cost should be used by the cost analyst? The answer to that
question will depend on a number of factors; if his purpose is to
compare one missile with another the cumulative quantity must be the
same for both missiles. The adjustment to a specific quantity can be
made very simply if the slope of the learning curve is known or can be
inferred from the data. To illustrate, costs for three missiles are
shown below. The cost is the same for each item, but the quantity is
different. To compare the costs for the items, they must be adjusted
to a common quantity.

Missile     Unit Number     Cost/Unit

   1             50           $1000
   2            100            1000
   3            200            1000

If the quantity 100 is chosen and an 80 percent learning curve assumed
for all three missiles, the adjusted costs will be:


Missile     Unit Number     Cost/Unit

   1            100           $ 800
   2            100            1000
   3            100            1250

Projecting labor requirements for the 100th unit when only 50 units
have been produced is somewhat uncertain, of course, but ignoring the
cost-quantity relationship will in most instances result in greater
error than such a projection introduces.
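
The adjustment to a common quantity can be sketched as follows. The
sketch assumes the log-linear unit-cost form of the learning curve, in
which cost falls to a fixed fraction (here 80 percent) each time the
unit number doubles; the function name adjust_to_quantity is
hypothetical, and the three missiles are those of the example above.

```python
import math


def adjust_to_quantity(cost, from_unit, to_unit, slope=0.80):
    """Move a unit cost along a log-linear learning curve to another unit number.

    slope is the fraction to which unit cost falls with each doubling of
    quantity (0.80 for an 80 percent curve). The log-linear unit-cost form
    is an assumption of this sketch.
    """
    exponent = math.log(slope) / math.log(2)
    return cost * (to_unit / from_unit) ** exponent


# Missile number -> (unit number at which the cost was observed, cost per unit).
missiles = {1: (50, 1000.0), 2: (100, 1000.0), 3: (200, 1000.0)}

adjusted = {
    missile: adjust_to_quantity(cost, unit, 100)
    for missile, (unit, cost) in missiles.items()
}
for missile, cost in adjusted.items():
    print(missile, 100, round(cost))
```

Moving from unit 50 to unit 100 is one doubling, so the cost is
multiplied by 0.80; moving from unit 200 back to unit 100 reverses one
doubling, so the cost is divided by 0.80.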

The learning curve is most frequently depicted as a straight line
on log-log paper as in Fig. II-3. The points above the curve illustrate
a  point  made  earlier.  They  show  the  effect  of  adjusting  production 
costs  incurred  over  the  period  1954-1958  to  1965  dollars. 


Other Possible Cost Adjustments

As exemplified earlier by the mention of productivity changes over
time and the lack of a way to adjust cost data for such changes, many
more kinds of adjustments can be theorized than have been quantified.
It has been suggested, for example, that some adjustment may be required
because of differences in contract type--fixed price, fixed price
incentive, cost plus fixed fee, etc.--or differences in the type of
procurement--competitive bidding or sole source. The hypothesis here is
that the type of contract or procurement procedure will bias costs up
or down, but this has been an exceedingly difficult hypothesis to
substantiate.

Another  suggestion  concerns  manufacturing  techniques.  What  are 
the  effects  of  varying  amounts  of  capital  investment  or  capital  improve¬ 
ment  and  of  changes  in  manufacturing  state  of  the  art?  A  related  ques¬ 
tion  concerns  the  efficiency  of  the  contractor.  We  may  suspect  that 
Contractor  A  has  been  a  lower  cost  producer  than  Contractor  B  on  simi¬ 
lar  items,  but  this  is  extremely  difficult  to  substantiate.  A  low-cost 
producer  may  be  one  who  because  of  his  geographical  location  pays  lower 
labor  rates.  Contractors  in  Fort  Worth,  Texas  and  Atlanta,  Georgia  may 
have  a  considerable  advantage  in  this  regard  over  their  competitors  in 
Los  Angeles,  San  Francisco  and  Seattle.  The  table  below  does  not  give 
a fair picture of comparative rates because differences between
industries in the various cities tend to be more important than
differences in location. But it can be seen for two cities as close
together as Los Angeles and San Francisco that labor rates differ by
about 10 percent. Thus while it might not be possible to adjust cost
data on the basis of contractor efficiency, it is possible to make
adjustments for differences in location by using the specific area
labor rates.

Table II-4

AVERAGE HOURLY EARNINGS OF PRODUCTION WORKERS
ON MANUFACTURING PAYROLLS--NOVEMBER 1965*

Atlanta .................. $2.69
Boston ...................  2.69
Chicago ..................  2.91
Detroit ..................  3.45
Los Angeles ..............  3.04
New Orleans ..............  2.72
New York .................  2.63
Philadelphia .............  2.79
St. Louis ................  2.96
San Francisco ............  3.35
Seattle ..................  3.25

*From Employment and Earnings, Bureau of Labor Statistics, January 1966.
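
A location adjustment of the kind mentioned above amounts to a ratio of
area labor rates. The sketch below uses the Table II-4 figures; the
function name location_factor is hypothetical, and, as the text
cautions, all-manufacturing averages are only a rough stand-in for the
rates of a particular industry.

```python
# Average hourly earnings of production workers, November 1965 (Table II-4).
area_rates = {
    "Atlanta": 2.69, "Boston": 2.69, "Chicago": 2.91, "Detroit": 3.45,
    "Los Angeles": 3.04, "New Orleans": 2.72, "New York": 2.63,
    "Philadelphia": 2.79, "St. Louis": 2.96, "San Francisco": 3.35,
    "Seattle": 3.25,
}


def location_factor(from_city, to_city, rates=area_rates):
    """Ratio by which labor dollars change when restated at another area's rate."""
    return rates[to_city] / rates[from_city]


factor = location_factor("Los Angeles", "San Francisco")
print(f"San Francisco rate exceeds Los Angeles rate by {(factor - 1) * 100:.1f} percent")
```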


III. USING STATISTICS IN THE
DEVELOPMENT OF ESTIMATING RELATIONSHIPS


As stated in a previous chapter, many, perhaps most, estimating
relationships are simple statements indicating that the cost of some
commodity is directly proportional to the weight, area, volume or some
other physical characteristic of that commodity. These estimating
relationships are simple averages--very useful in a variety of
situations but because of their simplicity requiring little explanation
here. Our concern is with the derivation of more complex relationships,
i.e., equations that describe the basic data better than a simple
factor can and that can reflect the influence on cost of more than one
variable. The intent is to illustrate a general approach to the
development of such relationships and to introduce certain basic
concepts of statistical analysis. The emphasis is not on statistics per
se, and the basic mathematical statistical theory involved as well as
the computational aspects of regression analysis are generally ignored.

This chapter merely presents some of the statistical considerations
involved in developing estimating relationships for advanced equipment
estimating. While statistical procedures are stressed, the intent is
not to suggest that regression analysis offers a quick and easy solution
to all the problems of estimating cost. Statistical analysis can help
provide an understanding of factors which influence cost, but
estimating relationships are no substitute for understanding.

The outstanding characteristic of a cost factor is that the
relationship between cost and the explanatory variable is direct and
obvious; thus, cost per pound is widely used because of the generally
satisfying thesis that as a ship, tank, or aircraft increases in weight
it becomes more costly. Weight changes do not always explain cost
changes, however, and many other explanatory variables are used. The
problem is to find these, and this is done first by deciding what
variables are logically or theoretically related to cost and then by
looking for patterns in the data that suggest a relationship between
cost and these variables. A simple array, as in Table III-1, may reveal
such patterns.


Table III-1

TEN AIRBORNE RADIO COMMUNICATION SETS

Cost ($)    Weight (lb)    Power Output (w)    Frequency (mh)
 22,200          90               20                 400
 17,300         161              400                  30
 11,800          40               30                 400
  9,600         108               10                 400
  8,800          82               10                 400
  7,600         135              100                  25
  6,800          59                6                 400
  3,200          68                8                 156
  1,700          25                8                  42
  1,600          24                 .5               258


In this table, the costs of 10 airborne radio communications sets
are given along with the weight, power output and frequency of each.
A priori, one might expect cost to increase with weight or with power
output. Frequency is included because, historically, higher and higher
frequencies have been sought to increase communications capacity, and
in general for a given power output higher frequency sets have been
more costly.

From Table III-1 it is clear that cost is not a simple linear
function of any of the three possible explanatory variables shown. Cost
tends to increase with weight, but there are notable exceptions to the
trend as shown in the scatter diagram of Fig. III-1a. Cost plotted
against power output (Fig. III-1b) is even less promising, partially
because of the scale, which does not enable an observer to distinguish
among the points between .5 and 30 watts. Changing from an arithmetic
to a logarithmic scale (Fig. III-2) distinguishes better among points
in the low power range and indicates that a trend does exist but, again,
with a very wide scatter.


Fig. III-1a--Cost versus weight

Fig. III-1b--Cost versus power output

Fig. III-2--Cost versus power output (logarithmic grid)


It appears that the scatter may be explained to some extent by
the effect of frequency, and in Fig. III-3 each point is identified
by frequency class:


HF - up to 30 mh
VHF - 30 to 300 mh
UHF - above 300 mh

A clearer relationship exists between cost and power output within each
frequency class than would seem to exist for the whole sample scattered
without regard to frequency. This suggests that the sample is not
homogeneous. Each frequency band may constitute a separate sample, or
possibly HF and VHF costs are on one level and UHF costs on another.
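
The grouping described above can be sketched in a few lines of Python. This is only an illustration: the frequency values are those of Table III-1, the function name is hypothetical, and a set at exactly 30 mh is assumed to fall in the HF class, since the class limits in the text overlap at that point.

```python
# Frequency values (mh) from Table III-1, in table order.
frequencies = [400, 30, 400, 400, 400, 25, 400, 156, 42, 258]


def frequency_class(freq_mh):
    """Return the band label for a frequency, using the class limits in the text.

    Assumption: a set at exactly 30 mh is treated as HF.
    """
    if freq_mh <= 30:
        return "HF"
    if freq_mh <= 300:
        return "VHF"
    return "UHF"


# Count how many of the ten sets fall in each class.
counts = {}
for f in frequencies:
    band = frequency_class(f)
    counts[band] = counts.get(band, 0) + 1

print(counts)
```

Under that assumption the HF and VHF classes together hold five sets and the UHF class holds five, which is why so few points are available for any separate analysis.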

With  a  larger  data  base  each  sample  could  be  examined  separately 
and  a  regression  line  drawn  for  each.  Given  a  maximum  of  five  points 
in  each  of  two  samples,  however,  regression  analysis  techniques  are  not 
warranted. The justification for regression analysis (as distinct from
simply drawing a line of best fit through the points either by a
least-squares or freehand technique) is to be able to say something
about the reliability of the regression line; in this case statistical
measures of reliability would have little meaning.

Fig. III-3--Frequency class identified


At this point it is not clear that any of the possible explanatory
variables, either singly or in combination, will yield a useful
estimating relationship. But as a means of illustrating some of the
techniques commonly used in deriving such relationships, let us begin with
the assumption that cost can be related to a single predictive
variable--weight--and examine the results of a linear normal regression
model. In a later example we shall consider several variables in a
multiple regression analysis.

Regression  theory  has  become  a  widely  accepted  tool  for  cost  analysts 
and  is  often  used  to  develop  estimating  relationships.  In  simple  re¬ 
gression  analysis  we  are  interested  in  estimating  the  value  of  one 
variable  based  on  its  relationship  to  a  second  variable.  Regression 
theory  provides  a  means  for  examining  whether  a  relationship  exists; 
and  when  it  does,  for  measuring  the  nature  and  extent  of  the  relation¬ 
ship. 

According to classical statistics a population (or universe)
defines the totality of all pertinent values that any variable or
variables can achieve. It follows that the true relationship between
two variables must be embodied within a population. (It is seldom
known, however, whether the set of values available in any given
problem constitutes a population or is only a subset (sample) of a larger
population. Generally, these values are considered to be a sample
which can be used to estimate relationships for an actual population.)

The form of the regression function depends, of course, upon the
problem. It may reflect an underlying physical law or perhaps some
other structural relationship. When no particular functional form is
suspected, the simple linear-regression model is frequently used to
describe the relationship between two variables. The equation of this
model is:

y = a + bx

where y is the dependent variable and x the independent variable.
The symbols a and b are parameters or constants whose values are to
be calculated from the data. Here y could be the cost of a radio
communication set and x the weight. The model then indicates that
heavier equipment will cost more than lighter equipment. The values of
a, b and x allow a computation of a value for the cost of any
equipment if we know its weight.

To make statistical predictions, certain assumptions must be made
about this model. The classical requirement is that x values are
fixed and y values are random variables for given x values. This is
graphically illustrated in Fig. III-4. Specifically, for the
population it is assumed that (1) the variance of y-values about the
regression line is the same for all x-values (x1, x2, x3, x4, etc.) and
(2) y-values for a given x value are normally distributed about the
regression line. For the sample it is assumed that y-values are simple
random samples taken from the total population.*

*For a more complete statement of the assumptions about the sample
see W. A. Spurr and C. P. Bonini, Statistical Analysis for Business
Decisions, Richard D. Irwin, Inc., 1967, pp. 564-565.

Fig. III-4--Simple linear population regression model


Given the regression model shown above, the basic problem is to
derive estimates of the parameters a and b such that the regression
equation will approximate the sample data as closely as possible. One
procedure for doing this uses the method of maximum likelihood.* In
normal linear regression it turns out that the maximum likelihood
method is exactly equivalent to a least-squares procedure. The values
of a and b are determined by the requirement that the sum of the squares
of the deviations of the sample observations from the regression line
will be at a minimum. The two normal equations for linear regression
are:

$$\sum y = na + b\sum x$$

$$\sum yx = a\sum x + b\sum x^2$$

*The principle of maximum likelihood is discussed in Introduction
to the Theory of Statistics by A. M. Mood, McGraw-Hill, 1950, pp. 152-154.

In this example:

y = cost of airborne radio equipment (in thousands of dollars)
x = weight of airborne radio equipment (in pounds)
n = number of items in sample
Σ = sum of (e.g., Σy = the sum of all y's)

Table III-2 shows the relevant numerical values to be substituted in
the above equations. They are:

n = 10
Σy = 90.6
Σx = 792
Σyx = 8739.4
Σx² = 81,540

Substituting these numbers in the normal equations, we obtain:

90.6 = 10a + 792b
8739.4 = 792a + 81,540b

Table III-2

DATA FOR REGRESSION ANALYSIS OF COST AND WEIGHT

    X        Y         X²         Y²         XY
   90     22.2      8,100     492.84     1998.0
  161     17.3     25,921     299.29     2785.3
   40     11.8      1,600     139.24      472.0
  108      9.6     11,664      92.16     1036.8
   82      8.8      6,724      77.44      721.6
  135      7.6     18,225      57.76     1026.0
   59      6.8      3,481      46.24      401.2
   68      3.2      4,624      10.24      217.6
   25      1.7        625       2.89       42.5
   24      1.6        576       2.56       38.4
  ---     ----     ------    -------     ------
  792     90.6     81,540    1220.66     8739.4

Solving these simultaneously gives:

a = 2.477
b = .083

Or:

y = 2.477 + .083x
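
As a check on the arithmetic, the two normal equations can be solved directly. The sketch below is illustrative only: it takes the X and Y columns of Table III-2, forms the same sums, and solves the pair of equations by elimination. The variable names are not from the original text.

```python
# Data from Table III-2.
x = [90, 161, 40, 108, 82, 135, 59, 68, 25, 24]            # weight, lb
y = [22.2, 17.3, 11.8, 9.6, 8.8, 7.6, 6.8, 3.2, 1.7, 1.6]  # cost, $ thousands

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_yx = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi * xi for xi in x)

# The two normal equations are
#   sum_y  = n*a     + b*sum_x
#   sum_yx = a*sum_x + b*sum_x2
# Solving this 2-by-2 system for a and b (Cramer's rule):
det = n * sum_x2 - sum_x ** 2
b = (n * sum_yx - sum_x * sum_y) / det
a = (sum_y * sum_x2 - sum_x * sum_yx) / det

print(f"a = {a:.3f}, b = {b:.3f}")
```

Rounded to three decimals, the result agrees with the coefficients quoted above.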

The regression line represented by the equation is shown in Fig.
III-5 as the solid line. Its usefulness for predictive purposes
depends on the extent of the dispersion of the observations about it--
the greater the dispersion of observed values of y about the line, the
less accurate estimates based on the line are likely to be. The
measure of the dispersion of the actual observations is the standard error
of estimate (S) of the regression equation.

Fig. III-5--Regression line and standard error of estimate

The standard error of estimate is defined as the square root of the
unexplained variance of the y's in the sample. This unexplained
variance is derived from the difference between the observed y values
(from Table III-1) and the computed y values (computed from the
regression equation). This is illustrated in Fig. III-6.

Fig. III-6--Unexplained and explained variance

Explained variance, which we will deal with later, is derived from the
difference between the computed y values and the mean of the observed
values. Total variance is the sum of the two.

Expressed mathematically, with y_c denoting a computed y value,
unexplained variance is:

$$\sigma_u^2 = \frac{\sum (y - y_c)^2}{n}$$

Thus, the unadjusted standard error of estimate is the square root of
this expression, or:

$$S = \sqrt{\frac{\sum (y - y_c)^2}{n}}$$

To compensate for the fact that standard errors calculated for
small samples typically understate the dispersion in the population,
an adjustment is required. The adjusted standard error of estimate (S)
is obtained by subtracting the number of parameters in the regression
equation from the sample size (n) in the formula for S. In this case
the number of parameters is two (a and b). Therefore the formula for
S is:

$$S = \sqrt{\frac{\sum (y - y_c)^2}{n - 2}}$$


From this it is clear that for large sample sizes the adjustment is
of no importance. In small samples--particularly very small samples
such as we are dealing with here--the adjustment can make quite a
difference.

The standard error of estimate for the estimating equation
y = 2.477 + .083x is $5,800, and in Fig. III-5 a band of ± S from the
regression line has been plotted. In interpreting the standard error
of estimate the main point is that in normal linear regression analyses
one might expect about two-thirds of the sample observations to fall
within a region bounded by ± S from the regression line. Virtually
all observations should fall within ± 3 S. In practice these
generalizations do not tend to hold up very well in very small sample cases.

For some purposes--particularly in comparing one S with another--
it is useful to compute a relative standard error of estimate. One
such measure is the coefficient of variation (C), which relates the
standard error of estimate to the mean of the sample y's:

$$C = \frac{S}{\bar{y}}$$

In the example the mean of the y's is $9,060. The value of C,
therefore, is:

$$C = \frac{\$5,800}{\$9,060} = .64$$

which is quite high. While the question of reliability of an estimating
equation is a relative matter, that is, it is relative to the context
in which the equation is to be used, something like 10 to 20 percent
would be more desirable.
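
The standard error of estimate and the coefficient of variation can be reproduced from the sample with a short calculation. The following is a minimal sketch, assuming the weight and cost data of Table III-2 and the n - 2 adjustment described above. The variable names are illustrative, and the count of observations inside the ± S band is added only to show how the "two-thirds" rule of thumb fares in this small sample.

```python
import math

# Data from Table III-2.
x = [90, 161, 40, 108, 82, 135, 59, 68, 25, 24]            # weight, lb
y = [22.2, 17.3, 11.8, 9.6, 8.8, 7.6, 6.8, 3.2, 1.7, 1.6]  # cost, $ thousands
n = len(x)

# Least-squares line (equivalent to solving the normal equations).
mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
b = sxy / sxx
a = mean_y - b * mean_x

# Deviations of observed y from computed y, and their sum of squares.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)

s_unadjusted = math.sqrt(sse / n)
s_adjusted = math.sqrt(sse / (n - 2))   # two parameters estimated: a and b
c = s_adjusted / mean_y                 # coefficient of variation

# Number of sample points lying within one adjusted S of the line.
within_one_s = sum(1 for e in residuals if abs(e) <= s_adjusted)

print(f"S (unadjusted) = {s_unadjusted:.2f}")
print(f"S (adjusted)   = {s_adjusted:.2f}")
print(f"C              = {c:.2f}")
print(f"points within +/- S: {within_one_s} of {n}")
```

The adjusted figure is noticeably larger than the unadjusted one, which illustrates how much the adjustment matters with only ten observations.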

The standard error of estimate and the coefficient of variation
indicate how well the regression equation describes the sample
observations, but this is rarely the area of greatest interest. The analyst
is usually more interested in using the estimating equation to predict
costs in the population or universe of items that the sample supposedly
represents, and the standard error of estimate does not furnish a good
measure of the reliability of the regression equation for predictive
purposes. The subject of reliability raises several additional
considerations. First is the question of whether x and y are actually
related in the manner indicated by the regression equation. A
particular sample could show such a relationship out of pure chance when in
fact none exists. Second, the regression equation obtained from the
sample is only one of a family that could be obtained from different
samples within the same population. This means that the predicted y
may not be the true y. Both questions are dealt with by statistical
inference, the first by a test of statistical significance and the
second by establishing a prediction interval for the regression line.

While the subject of statistical testing is too complex to treat
in any detail here, basically what is involved is to set up the hypothesis
that x and y are not related (the null hypothesis), and then let the
testing procedure indicate whether the hypothesis is accepted or
rejected at some specified level of probability. The particular test
to be used here is commonly known as the t-test because it uses the
t-ratio, or ratio of a coefficient to its standard error. This ratio is
expressed:

$$t_b = \frac{b}{s_b}$$

where b = the regression coefficient (from the linear regression model
          y = a + bx)
      s_b = the standard error of b

The value obtained for t_b is 1.96, and this is interpreted by reference
to a table of t-values.* The relevant row from such a table is shown
below.

Degrees of        Level of Significance (or Probability)
 Freedom         .20      .10      .05      .02      .01
    8           1.397    1.860    2.306    2.896    3.355


Note that the first column is headed "Degrees of Freedom" instead of
n, the number of items in the sample. In a regression analysis the
term "degrees of freedom" means the sample size minus the number of
parameters (values to be estimated, i.e., a and b) in the regression
equation, or in this case, 10 - 2 = 8. The value of 1.96 is seen to
lie between the .10 and .05 levels of significance. This means that
the chances are between 5 and 10 percent that a sample taken from a
population in which x and y have zero correlation could have a t as
high as 1.96. Hence, if we establish the required level of probability
at 10 percent, the hypothesis that there is no correlation in the
population is rejected. On the other hand, if a .05 level of significance
seems appropriate, the hypothesis must be accepted.
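
The t-ratio and the comparison against the table row can be sketched as follows. The text does not give a formula for the standard error of b, so the usual textbook expression, S divided by the square root of the sum of squared deviations of x from its mean, is assumed here. The critical values are copied from the table row above, and the variable names are illustrative.

```python
import math

# Data from Table III-2.
x = [90, 161, 40, 108, 82, 135, 59, 68, 25, 24]            # weight, lb
y = [22.2, 17.3, 11.8, 9.6, 8.8, 7.6, 6.8, 3.2, 1.7, 1.6]  # cost, $ thousands
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
b = sxy / sxx
a = mean_y - b * mean_x

sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))      # adjusted standard error of estimate

# Assumed (standard) formula for the standard error of the slope.
s_b = s / math.sqrt(sxx)
t_b = b / s_b

# Critical t-values for 8 degrees of freedom, from the table in the text.
t_critical = {0.20: 1.397, 0.10: 1.860, 0.05: 2.306, 0.02: 2.896, 0.01: 3.355}

print(f"t_b = {t_b:.2f}")
for level, crit in t_critical.items():
    verdict = "rejected" if t_b > crit else "accepted"
    print(f"level {level:.2f}: critical t = {crit:.3f}, null hypothesis {verdict}")
```

The computed ratio falls between the .10 and .05 critical values, so the outcome of the test depends on which level the analyst chooses.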

A reasonable question at this point is: What should be the level
of probability for accepting or rejecting the hypothesis? Unfortunately,
no simple answer is possible. The 10, 5, and 1 percent values are
probably most commonly used, but the analyst must make his own judgment
based on the risk assumed by rejecting a true hypothesis (a Type I
error) or accepting a false hypothesis (a Type II error).**

*All the references at the end of the chapter contain t-tables.

**For a good discussion of this see Business and Economic Statistics
by W. A. Spurr, L. S. Kellogg and J. H. Smith, Richard D. Irwin, Inc.,
1961, pp. 251-255.

For our purpose we will accept a 10 percent value both here and in
establishing a confidence or prediction interval for the regression line.
The procedure for that is as follows:

For a given value of the explanatory variable, say x, the
estimating equation is used to obtain a predicted value of the dependent
variable:

y = a + bx

Then we can put a boundary around y, say y ± A, such that there is a
certain level of confidence that the established interval does indeed
bracket the true value of y in the population.

In the case of normal linear regression, a 100(1 - ε) percent
prediction interval for an estimated value of the dependent variable
can be constructed as follows:

y ± A,

where

$$A = S\,t_\varepsilon \sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{\sum (x - \bar{x})^2}}$$

and:

S = standard error of the estimating equation from which y was
    obtained,

t_ε = the value obtained from a table of t-values for the ε
    significance level,

n = size of the sample,

x = the specified value of the explanatory variable used as a
    basis for obtaining y,

x̄ = the mean of the x's in the sample,

Σ(x - x̄)² = the sum of squared deviations of the sample x's from their
    mean.

Using the estimating equation derived previously, the cost of a
communications set weighing 100 lb is estimated to be $10,777. To
establish a 90 percent prediction interval around this value the necessary
data are:

S = $5,800
ε = 0.1 (since a .90 prediction interval is to be computed,
    1 - ε = .9, or ε = .1)
n = 10
x = 100 lb
x̄ = 79.2 lb
Σ(x - x̄)² = 18,893 lb²

Substituting in the above equation and solving for A gives:

A = $12,380

Therefore, for x = 100 lb, the 90 percent prediction interval is:

y ± A = $10,777 ± $12,380

This means that when all the underlying assumptions about the sample
are met, we have a subjective confidence of 90 percent that this
interval brackets the true or population value of y when x = 100. It should
be emphasized that a 90 percent prediction interval does not mean that
the probability is 0.90 that the true value of y lies within the
interval. Rather, it means that if we were to repeat the prediction
procedure a number of times, we would expect that 90 percent of the time
our prediction intervals would include the true value of y. The point
is that the true value of y, while unknown to us, is a constant and
not a random variable that could take on many values. Therefore, the
relevant probability concept is that 90 percent of the intervals
computed as this one has been will include the true value of y. This
statement, of course, depends on the assumptions depicted in Fig. III-4.


Using the prediction interval procedure outlined above, we can
compute 90 percent prediction intervals for other values of x and plot
these numbers to obtain a 90 percent confidence band around the
regression line as in Fig. III-7. In this case it is clear from the
figure that the 90 percent confidence region is fairly wide, reflecting
graphically a measure of the uncertainty associated with the estimating
equation. This is typical of analyses based on small samples. The
equation for the prediction interval is constructed so that the width
of the interval is quite sensitive to variation in sample size when n
is small. Sensitivity to small values of n is logical, since
generalizations based on very small samples should be subject to greater
uncertainty than those founded on a larger data base.

Fig. III-7--Ninety percent prediction interval

It should also be noted that the prediction interval becomes wider
as values of x farther from the mean value of the sample are selected.
Thus, for example, the prediction interval for the mean (79 lb) is
$9,300 ± $12,500, while for x = 200 lb it is $19,000 ± $15,990. The
width of the interval in the latter case is about 1.3 times the width
for the mean weight. This illustrates in a rough way how our confidence
in the estimate decreases as we extrapolate beyond the range of the
sample data--something that we often do in estimating the cost of advanced
equipment.
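
The widening of the interval away from the sample mean can be illustrated numerically. The sketch below assumes the usual form of the half-width for normal linear regression, A = S t sqrt(1 + 1/n + (x - mean)^2 / sum of squared deviations of x), together with the data of Table III-2 and the tabled t-value of 1.860 for 8 degrees of freedom at the .10 level. The function name is hypothetical. The dollar amounts it prints depend on the t-value and rounding used, so no attempt is made to reproduce the figures quoted in the text; only the relative widths are of interest.

```python
import math

# Data from Table III-2.
x = [90, 161, 40, 108, 82, 135, 59, 68, 25, 24]            # weight, lb
y = [22.2, 17.3, 11.8, 9.6, 8.8, 7.6, 6.8, 3.2, 1.7, 1.6]  # cost, $ thousands
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
b = sxy / sxx
a = mean_y - b * mean_x
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

T_10_PERCENT = 1.860  # 8 degrees of freedom, .10 level (from the t-table row)


def half_width(x0, t=T_10_PERCENT):
    """Half-width A of the prediction interval at weight x0 (assumed formula)."""
    return s * t * math.sqrt(1 + 1 / n + (x0 - mean_x) ** 2 / sxx)


for x0 in (mean_x, 100, 200):
    print(f"x = {x0:6.1f} lb: y = {a + b * x0:6.2f}, A = {half_width(x0):6.2f}")

ratio = half_width(200) / half_width(mean_x)
print(f"width at 200 lb relative to width at the mean: {ratio:.2f}")
```

The ratio of the two widths comes out at roughly 1.3, in line with the comparison made above, and passing a smaller t-value to the function narrows the interval at every weight.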

The width of the prediction interval is also sensitive to the level
of confidence specified. Up to now that level has been set at 90
percent (i.e., ε = .1). Suppose that only a 70 percent level of confidence
is desired (ε = 0.3). The only thing that changes in the inputs used in
the previous calculations is the value of t. Before, we used t_.1 = 1.86;
now we use t_.3 = 1.108. This will make quite a difference in the width
of the prediction interval. Since our confidence is lower, the
prediction interval can be narrower, and for lower levels of confidence,
the band would be even narrower. However, except for very low levels
of confidence the interval obtained by the prediction interval procedure
will always be wider than an interval established on the basis of the
standard error of estimate alone.*

*But recall the point made previously: S can only be used to
measure variations of y in the sample, not for describing the uncertainty
of a predicted y.

Up to this point the discussion has been confined largely to
statistical regression analyses--developing an estimating equation and
various measures of uncertainty pertaining to that equation. From an
estimating point of view, this indeed is the most important part of the
analysis. There is, however, another form of statistical analysis called
correlation analysis. Correlation analysis is concerned with
developing an abstract measure of the degree of association between the dependent
variable and the explanatory variable or variables. In simple linear
regression the most commonly used measure of degree of association is
the correlation coefficient (r). The coefficient r is constructed in
such a way that it is bounded by the interval ± 1. The sign indicates
whether the slope of the regression line is positive or negative, i.e.,
whether the regression coefficient b is positive or negative. At the
boundaries of the interval for r we have the cases of perfect
correlation: r = +1 (perfect positive correlation); r = -1 (perfect negative
correlation). In these instances all of the sample points would lie
exactly on the regression line. When there is no correlation between
the variables whatsoever, r = 0.

While correlation is a somewhat different type of analysis from
that discussed previously, it is nevertheless related in a definite way
to regression analysis. To see this let us return to the concepts of
total variance, explained variance, and unexplained variance referred
to earlier in the discussion of the standard error of estimate and
illustrated in Fig. III-6. Total variance ($\sigma_t^2$) pertains to the
deviations of the y values in the sample from their mean, and is
measured by:

$$\sigma_t^2 = \frac{\sum (y - \bar{y})^2}{n}$$

Explained variance ($\sigma_e^2$) refers to the deviations from $\bar{y}$ of the computed
y values (calculated from the regression equation) corresponding to the
values of the independent variable x in the sample, and is measured by:

$$\sigma_e^2 = \frac{\sum (y_c - \bar{y})^2}{n}$$

As explained previously, the standard error of estimate
(unadjusted) is the square root of the unexplained variance. The coefficient of
correlation (r), on the other hand, is related to the explained variance.
It is defined as the square root of the proportion of total variance
that is represented by the explained variance.* That is:

$$r = \sqrt{\frac{\sum (y_c - \bar{y})^2}{\sum (y - \bar{y})^2}}$$

*$r^2$ is sometimes referred to as the coefficient of determination.


We now see the interrelationship among r, S, and the regression
equation. The regression equation is used to determine the computed
y's, which are inputs to the calculation of both r and S. Also, since
r is defined as a proportion of total variance, r and S in a sense have
an inverse relationship to one another.
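
The relationship among total, explained, and unexplained variation, and the unadjusted correlation coefficient built from them, can be illustrated with the sample data. This is a minimal sketch using the weight and cost columns of Table III-2. Sums of squares are used rather than variances because the divisor n cancels in the ratio, and the variable names are illustrative.

```python
import math

# Data from Table III-2.
x = [90, 161, 40, 108, 82, 135, 59, 68, 25, 24]            # weight, lb
y = [22.2, 17.3, 11.8, 9.6, 8.8, 7.6, 6.8, 3.2, 1.7, 1.6]  # cost, $ thousands
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
b = sxy / sxx
a = mean_y - b * mean_x
y_c = [a + b * xi for xi in x]      # computed y values

total = sum((yi - mean_y) ** 2 for yi in y)
explained = sum((yc - mean_y) ** 2 for yc in y_c)
unexplained = sum((yi - yc) ** 2 for yi, yc in zip(y, y_c))

r_squared = explained / total
# r carries the sign of the regression coefficient b.
r = math.copysign(math.sqrt(r_squared), b)

print(f"total = {total:.2f}, explained = {explained:.2f}, "
      f"unexplained = {unexplained:.2f}")
print(f"r squared = {r_squared:.3f}, r (unadjusted) = {r:.2f}")
```

The explained and unexplained sums add up to the total, so a larger unexplained share (and hence a larger S) necessarily means a smaller r. The value printed here is not corrected for sample size.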

Just as S had to be adjusted for sample size--particularly so in
the case of small samples--r should also be corrected. The value of
r corrected for sample size is as follows:


As  is  obvious  from  this  equation,  the  effect  of  the  correction  dampens 
out  as  n  becomes  large.  For  very  small  samples  the  correction  should 
most  certainly  be  made. 

The correlation coefficient adjusted for sample size in our
illustrative example is .48. This is quite low and tends to substantiate
the evidence already seen that weight alone is not a good predictor of
the cost of airborne radio communication equipment. However, it should
be kept in mind that a high correlation coefficient, say .95, can be
misleading. Mere correlation does not allow an analyst to infer a
cause-and-effect relationship between x and y. Spurious correlations are
common. For example, the number of bathtubs in the United States has
been increasing steadily and so has the crime rate as reported by the
FBI. One might very well find a statistical correlation between the
two much better than that found between cost and weight in the above
sample. Another point is that the coefficient of correlation may be
high but the reliability of an estimating equation as measured by the
standard error of estimate may be low. The explanation hinges on the
fact that r is based on a ratio while S is based on an absolute quantity:


Thus, even if the explained variance represents a high fraction of the
total variance, it is still possible for the unexplained variance to
be large.

CURVILINEAR ANALYSIS: LOGARITHMIC REGRESSION

Up to this point the analysis has been confined to simple linear
regression. While a first examination of the scatter diagram of cost
vs weight indicates that a linear relationship might be as good as
anything else, it still cannot be concluded definitely that some type of
non-linear relationship might not be preferable. Several such
relationships can be tried. One that is very frequently used, and that we
will be dealing with in discussing cost-quantity relationships in
Chapter V, is of the form:

$$y = ax^b$$

Since this equation is difficult to deal with statistically, usually
we make a logarithmic transformation of the variables, obtaining an
equation which is linear in the logarithms of the variables:

log y = log a + b(log x)

The procedure here is to conduct the statistical analysis in terms of
the logarithms of the variables, that is, obtaining estimates of log a
and b from a least squares fit of this equation. This approach has
several advantages over dealing directly with y = ax^b, the most
important ones being:

1. We can proceed almost identically to the simple linear regression
case.

2. No additional degrees of freedom are lost--an important
consideration when the sample size is small.

The first step is to take the original data for y and x contained
in Table III-1 and convert these data to logarithms. The next step is a
simple linear regression analysis of the data in logarithmic form. This
means that a linear regression equation is derived such that the sum of
the squares of the deviations of the logarithms of the observations from
the regression line is at a minimum. Solving as before, the estimates of
log a and b are found and the regression equation for the logarithms of
the variables is:

log y = -1.0425 + 1.0241 log x

This equation is shown as a solid line on the scatter diagram in Fig.
III-8. Note that here the original values (arithmetic form) of x and y
are plotted on a chart having logarithmic scales on both axes. This
is exactly equivalent to plotting the logarithms of the variables on
an arithmetic chart. Note also that the regression line slopes upward.
This is because the b-value is positive. With a negative b-value, the
line would slope down.

Fig. III-8--Logarithmic regression

The standard error of estimate is computed as before but in log
terms:

S_log = .2763

In Fig. III-8 the dashed lines indicate a band representing ± S_log
around the regression line.

For perspective, the S_log value may be related to the mean of the
log y's in the sample to obtain the coefficient of variation for the
log equation. The procedure is the same as that shown on p. 40.

C_log = S_log / (mean of log y) = .335


At this point it would appear that things have improved somewhat over
the simple linear regression case. The picture portrayed in Fig. III-8
suggests a better fit to the data. Also, the standard error of estimate
in relation to the mean of the log y's is substantially lower than
in the simple linear regression example: 34 percent as compared with
64 percent.

But this is not the whole story, since up to now the analysis has
dealt with the logarithms of the data, and the analyst is interested in
making estimates in terms of the original data. We therefore have to
transform the logarithmic analysis back to an arithmetic form. When
this transformation is made, the estimating equation becomes:

y = .09056 x^1.0241

where .09056 is the anti-log of log a = -1.0425. This equation is
plotted on the scatter diagram contained in Fig. III-9. It should be
noted that the equation plots as a straight line over the range of
weights shown. Since the exponent of x is close to unity, the
curvilinearity implied by the form of the equation does not show up. Note
also that the regression line does not appear to be a particularly good
fit to the original data--no better than the simple linear estimating
equation obtained previously.

Fig. III-9--Cost versus weight on arithmetic grid

To gain further insight, let us turn to the standard error of
estimate and compute a ± 1 S band about the regression line. This
band is illustrated by the dashed lines in Fig. III-9. We now have a
much different picture than that indicated in Fig. III-8 for the
logarithmic analysis. In Fig. III-9 the S interval is an ever-widening one
defined in terms of linear homogeneous functions of x. Recall that in
our simple linear regression analysis S = $5,800. If we lay off ± $5,800
around the regression line, the results are the dotted lines in Fig. III-9.
We conclude, therefore, that in this case the logarithmic regression
offers no improvement over the linear regression.

The  situation  portrayed  in  Fig.  III-9  has  sometimes  led  to  the 
suggestion  that  the  curvilinear  equation  be  used  for  small  values  of  x 
(because  the  standard  error  of  estimate  is  small)  and  the  linear  equation 
for  large  values.  It  is  important  to  keep  in  mind  that  the  difference 
between  the  two  standard  errors  of  estimate  in  Fig.  III-9  stems  from 
different  basic  assumptions  about  the  variance  of  y-values  about  the 
regression  line,  not  from  any  change  in  the  real  distribution  of  the 
variance. In the linear case, as pointed out previously, it is assumed
that the variance of the y-values about the regression line is constant.
In the curvilinear case the variance is still constant, but it is
constant in logarithmic terms, which means that it actually increases with
the magnitude of the dependent variable.

The logarithmic example contained in this section illustrates a
point that is often forgotten. A logarithmic transformation of the
variables has a tendency to compress and shape the original data in
such a way that a statistical fit to the logarithms looks good. Very
often,  however,  when  the  logarithmic  analysis  is  transformed  back  into 
terms  of  the  original  data,  the  results  do  not  appear  so  impressive.  In 
sum,  logarithmic  transformations  can  be  tricky  and  misleading.  The 
analyst  must  be  cautious  when  using  them. 
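
The transformation and its reversal can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
weights and costs below are hypothetical (they are not the Table III-1
data), and the helper names fit_line, fit_power_curve, and band are
invented for the example. The point it is meant to show is that a
constant ± S_log band in logarithms becomes a multiplicative, and hence
ever-widening, band in dollars.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def fit_power_curve(weights, costs):
    """Fit log10(cost) = log10(a) + b*log10(weight).

    Returns (a, b, s_log), where s_log is the standard error of
    estimate in log terms, using n - 2 degrees of freedom.
    """
    lx = [math.log10(w) for w in weights]
    ly = [math.log10(c) for c in costs]
    log_a, b = fit_line(lx, ly)
    resid = [y - (log_a + b * x) for x, y in zip(lx, ly)]
    s_log = math.sqrt(sum(r * r for r in resid) / (len(lx) - 2))
    return 10 ** log_a, b, s_log


def band(a, b, s_log, weight):
    """Estimate and +/- 1 S_log band, converted back to arithmetic terms."""
    estimate = a * weight ** b
    factor = 10 ** s_log  # a constant log offset is a constant multiplier
    return estimate / factor, estimate, estimate * factor


# Hypothetical data (NOT Table III-1): weight in lb, cost in $ thousands.
weights = [20, 35, 60, 90, 150, 240]
costs = [1.5, 4.0, 3.5, 9.0, 8.0, 21.0]

a, b, s_log = fit_power_curve(weights, costs)
for w in (50, 100, 200):
    low, est, high = band(a, b, s_log, w)
    print(f"W={w:>3}  estimate={est:6.2f}  band=({low:6.2f}, {high:6.2f})")
```

Because the band is obtained by multiplying and dividing the estimate by
the same factor, its width in dollars grows in proportion to the
estimate, which is the behavior described for the dashed lines in
Fig. III-9.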

CURVILINEAR  ANALYSIS:  SECOND-DEGREE  EQUATION 

We have just seen that for our illustrative example a logarithmic
regression does not seem to offer any improvement over the simple linear




The standard error of estimate is calculated as before, except here
we must add a term for x^2 and take into account the loss of the
additional degree of freedom. The result is that S is greater than that
obtained for the linear regression equation--$6,240 vs $5,800. An area
bounded by ± 1 S around the regression line is presented in Fig. III-10.
Relating S to the mean of the sample y's gives a coefficient of
variation of:

C = S / (mean of y) = $6,240 / $9,060

Should it be desired, a prediction interval may be calculated for
a value of y obtained from the estimating equation for specified values
of x and x^2, but for a second-degree regression the calculation is
somewhat complicated and in the present example is unlikely to add
anything to the analysis.

Insofar as measures of correlation are concerned, in curvilinear
analysis the coefficient of curvilinear correlation is usually referred
to as the index of correlation and is denoted by the symbol ρ. ρ^2 is
called the index of determination and in this example is equal to .37.
To adjust this for degrees of freedom the following formula may be
used:*

adjusted ρ^2 = [ρ^2 (n - 1) - (m - 1)] / (n - m)

where n is the sample size and m is the number of coefficients in the
regression equation (m = 3 in the case of second-degree regression).
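
As a quick arithmetic check, the adjustment can be evaluated for the two
equations compared in this section. The sketch below assumes n = 10 and
takes m = 2 for the simple linear equation (an inference from the
definition of m, not a value stated in the text); the function name is
invented for the example.

```python
def adjusted_index_of_determination(rho_sq, n, m):
    """Adjust an index of determination for degrees of freedom.

    rho_sq: unadjusted index (or coefficient) of determination
    n:      sample size
    m:      number of coefficients in the regression equation
    """
    return (rho_sq * (n - 1) - (m - 1)) / (n - m)


# Second-degree equation: three coefficients, unadjusted value .37.
second_degree = adjusted_index_of_determination(0.37, n=10, m=3)
# Simple linear equation: two coefficients (assumed), unadjusted value .32.
linear = adjusted_index_of_determination(0.32, n=10, m=2)

print(f"second-degree: {second_degree:.3f}")
print(f"linear:        {linear:.3f}")
```

On these figures the adjusted value for the second-degree equation
(about .19) falls below that for the linear equation (about .235), so
the apparent gain in explained variation disappears once the extra
coefficient is charged for.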

Comparing  the  results  of  the  statistical  analysis  for  the  second- 
degree  regression  case  with  those  obtained  for  the  simple  linear  re¬ 
gression  example  suggests  that  the  second-degree  regression  offers  no 
improvement  over  the  simple  linear  case.  The  standard  error  of  estimate 
is  increased  by  $430  and  the  coefficient  of  variation  is  higher  by  7 
percentage  points.  The  explained  variation  is  higher  by  5  percentage 
points, but it is questionable whether such an improvement is significant
in a statistical sense.

                                        Simple linear    Second-degree
                                        regression       regression

Standard error of estimate              $5,800           $6,240
Coefficient of variation                .64              .71
Coefficient (index) of determination
  (unadjusted)                          .32              .37
Coefficient (index) of correlation
  (unadjusted)                          .57              .61

* This equation is shown in slightly different form in Methods of
Correlation and Regression Analysis, Third Edition, by M. Ezekiel and
K. A. Fox, John Wiley and Sons, 1959, p. 301.

It  is  conceivable  when  dealing  with  a  small  sample  of  data  that  the 
differences  in  statistical  measures  presented  above  could  be  due  purely 
to sampling error. In this case, for example, the difference between
two (unadjusted) coefficients of determination is .05. A statistical
test might indicate that the chances are very small that two random
samples drawn from the assumed population would have a difference as
large as this. In other words it would seem highly unlikely that the
observed difference could be due to sampling variation. If this were
the case, the difference between the linear regression and the second-
degree regression would be considered significant.

A simple test to determine whether the incremental increase in
explained variance associated with the addition of the variable x^2
(or any additional variable) is significant involves the use of the
statistic F.* An F-test indicates whether the increase in explained
variance is significant in relation to the remaining unexplained
variance. In this case:

F = (increment of explained variance ÷ degrees of freedom)
    / (remaining unexplained variance ÷ degrees of freedom)

This can be rewritten

F = [(ρ_2^2 - ρ_1^2) / 1] / [(1 - ρ_2^2) / 7]

where

ρ_1^2 = r^2 of linear regression
ρ_2^2 = ρ^2 of 2d degree regression

* See Croxton, et al., p. ^27.

As explained earlier, the degrees of freedom are generally the sample
size minus the number of parameters in the regression equation, and
this holds true for the denominator of the above expression (10 - 3 = 7).
In the numerator only one degree of freedom is involved, the incremental
degree of freedom lost by adding another constant to the estimating
equation.

Substituting ρ^2 values in the above formula:

F = [(.37 - .32) / 1] / [(1 - .37) / 7] = .05 / (.63 / 7) = .56

This falls far short of the critical F value of 5.59 (at a .05 level
of significance),* indicating that the additional explained variance
is not considered significant. In other words the net increment of
explained variance associated with the introduction of x^2 (after
allowance for the loss of an additional degree of freedom) is not
sufficient to allow us to be reasonably confident that the improvement
is not due to chance.
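
The F computation above is simple enough to script. The following is a
minimal sketch, with an invented function name and argument names; the
critical value 5.59 is taken from the text rather than computed, since
the Python standard library has no F table.

```python
def incremental_f(r2_reduced, r2_full, n, m_full, added=1):
    """F statistic for the variance explained by `added` extra terms.

    r2_reduced: coefficient of determination without the extra term(s)
    r2_full:    coefficient (index) of determination with them
    n:          sample size
    m_full:     number of coefficients in the larger equation
    """
    numerator = (r2_full - r2_reduced) / added
    denominator = (1.0 - r2_full) / (n - m_full)
    return numerator / denominator


F_CRITICAL_05 = 5.59  # 1 and 7 degrees of freedom, as quoted in the text

f_value = incremental_f(r2_reduced=0.32, r2_full=0.37, n=10, m_full=3)
print(f"F = {f_value:.2f}; significant at .05: {f_value > F_CRITICAL_05}")
# prints: F = 0.56; significant at .05: False
```

The same function can be reused when a different variable is added, as
long as the critical value is looked up for the matching degrees of
freedom.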

* Most statistics texts contain an F table showing values for levels
of significance from .05 to .001. The F value of 5.59 is given for a
numerator of 1 degree of freedom and a denominator of 7.

MULTIPLE REGRESSION ANALYSIS

Previously the simple linear regression example was extended by
introducing the variable x^2 into the estimating equation. At this
point we shall go back to the simple linear case and consider some of
the possibilities in a multivariate analysis, e.g.:

1. Introduce power or frequency into the equation.

2. Abandon weight in favor of power and frequency.

3. Use three explanatory variables, i.e., power, weight, and
   frequency.

At  this  point,  two  technical  considerations  must  be  raised.  The 
first  is  a  stipulation  that  in  the  multiple  regression  model  to  be 
used, the explanatory variables must be non-correlated. If, for example,
weight  and  power  output  were  correlated,  the  addition  of  weight  would 
not  make  a  statistically  significant  contribution  to  the  explanation 
of  cost.  The  inclusion  of  a  non-significant  variable  is  undesirable 
for  a  very  practical  reason:  it  is  almost  as  likely  to  move  the  result 
away  from  an  accurate  estimate  as  toward  it. 

Hence,  before  deciding  whether  weight  can  be  used  in  conjunction 
with  power  output  and  frequency  the  relationship  between  them  must  be 
examined.  While  there  are  statistical  techniques  for  testing  whether 
or  not  a  significant  correlation  exists  between  two  variables,  a  simpler 
procedure  is  to  examine  scatter  diagrams  for  one  plotted  against  the 
other. From Fig. III-11 it is clear that no association exists between
weight and frequency and very little between weight and power output.
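
Where a scatter diagram leaves some doubt, the association between two
candidate explanatory variables can also be checked numerically. The
sketch below computes the ordinary (Pearson) correlation coefficient;
the weight, power, and frequency figures are hypothetical and are
included only to make the example runnable.

```python
import math


def pearson_r(xs, ys):
    """Coefficient of correlation between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sample of ten items (NOT the data behind Fig. III-11).
weight = [30, 45, 60, 75, 90, 110, 130, 150, 170, 200]
power = [400, 100, 250, 20, 500, 60, 300, 150, 10, 350]
frequency = [30, 300, 120, 30, 225, 400, 120, 30, 300, 225]

print(f"weight vs power:     r = {pearson_r(weight, power):+.2f}")
print(f"weight vs frequency: r = {pearson_r(weight, frequency):+.2f}")
```

A value of r near zero supports using the two variables together; a
value near +1 or -1 indicates that one of them adds little that the
other does not already explain.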

The second consideration is that a sample of 10 will barely support
simultaneous inferences about the effects of two explanatory variables.
To obtain a regression equation of satisfactory reliability with three
independent variables the sample should contain at least 20 observations.
Consequently, we shall limit our exploration here to the following
combinations of variables: weight and power output, weight and
frequency, and power output and frequency.

The  estimating  equation  for  linear  multiple  regression  analysis 
is  of  the  form: 


z  =  a  +  bx  +  cy 

And for the above combinations of variables the regression equations
are as follows:

C = 113.85 - .4523 W - .1308 F
C = 2.9303 + .07338 W + .004705 P
C = -.5257 + .04258 P + .02749 F

where:

C = cost        F = frequency
W = weight      P = power output

The various statistical measures of each are compared below with those
obtained for weight alone.

                                               Weight +    Weight +   Power +
                                     Weight    Frequency   Power      Frequency

Standard error of estimate (S)       $5,800    $137,145    $6,190     $5,000
Coefficient of variation (C)         .64       2.83        .68        .55
Coefficient of determination (R^2)   .32       .04         .33        .56
Coefficient of correlation (R)       .57       .2          .57        .75

The  addition  of  frequency  degrades  the  estimating  relationship 
tremendously,  giving  a  coefficient  of  correlation  close  to  zero.  Weight 
and  power  together  are  not  as  good  as  weight  alone,  and  the  only  im¬ 
provement  seen  is  for  the  combination  of  power  output  and  frequency. 
While it would be preferable to have a lower value for C and a higher
value  for  R,  this  combination  should  do  a  somewhat  better  job  of  pre¬ 
dicting  cost  than  would  weight  alone. 
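
For readers who wish to reproduce a fit of the form z = a + bx + cy, the
least squares coefficients for two explanatory variables can be obtained
directly from the normal equations. The sketch below is one way to do
this without any statistical package; the function name and the sample
figures are hypothetical, and it is not claimed to be the procedure used
to derive the equations above.

```python
def fit_two_variable(x1, x2, z):
    """Least squares fit of z = a + b*x1 + c*x2 via the normal equations.

    Returns (a, b, c, r_squared).
    """
    n = len(z)
    m1, m2, mz = sum(x1) / n, sum(x2) / n, sum(z) / n
    # Sums of squares and cross-products about the means.
    s11 = sum((u - m1) ** 2 for u in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    s1z = sum((u - m1) * (w - mz) for u, w in zip(x1, z))
    s2z = sum((v - m2) * (w - mz) for v, w in zip(x2, z))
    szz = sum((w - mz) ** 2 for w in z)

    det = s11 * s22 - s12 ** 2
    if abs(det) <= 1e-12 * s11 * s22:
        # The two explanatory variables carry the same information.
        raise ValueError("explanatory variables are (nearly) perfectly correlated")

    b = (s1z * s22 - s2z * s12) / det
    c = (s2z * s11 - s1z * s12) / det
    a = mz - b * m1 - c * m2
    r_squared = (b * s1z + c * s2z) / szz
    return a, b, c, r_squared


# Hypothetical data: power output, frequency, and cost ($ thousands).
power = [400, 100, 250, 20, 500, 60, 300, 150, 10, 350]
frequency = [30, 300, 120, 30, 225, 400, 120, 30, 300, 225]
cost = [17.0, 12.5, 13.0, 2.0, 27.5, 14.0, 16.5, 7.0, 9.0, 21.0]

a, b, c, r2 = fit_two_variable(power, frequency, cost)
print(f"C = {a:.4f} + {b:.5f} P + {c:.5f} F   (R^2 = {r2:.2f})")
```

The guard on the determinant reflects the stipulation made earlier: when
the explanatory variables are strongly correlated the normal equations
become ill-conditioned and the individual coefficients lose their
meaning.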

Earlier, we examined a curvilinear function with two variables.
A non-linear relationship of that type using three variables can be
examined here in an attempt to improve the reliability of the equation.
With three variables the equation would be of the form

z = a x^b y^c

Again making a logarithmic transformation of the variables to
facilitate computation and solving for the constants a, b, and c, we obtain

log C = -1.1933 + .5756 log P + .6085 log F

where:

C = cost
P = power output
F = frequency

C = .00641 P^.5756 F^.6085

This equation improves the fit considerably as shown by the comparison
below and is generally satisfactory on logical grounds as well since
power and frequency should be causally related to cost. Given the
limitations inherent in a sample of 10, the above estimating
relationship is probably as good as can be derived.

                                             Linear     Curvilinear

Standard error of estimate (S)               ±$5,000    +$3,200, -$2,[illegible]*
Coefficient of variation (C)                 .55        +.35, -.[illegible]*
Coefficient (index) of determination (R^2)   .51        .88
Coefficient (index) of correlation (R)       .71        .94

* Values at the sample mean ($9,060).
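
The unequal plus and minus values shown for the curvilinear equation
follow from the fact that its standard error is constant in logarithms.
A minimal sketch of the conversion is given below; the log standard
error of .13 is a hypothetical value chosen only because it gives an
upper band of roughly 35 percent, and the function name is invented for
the example. (As an aside, the anti-log of -1.1933 is about .0641, which
is worth checking against the leading coefficient before the equation is
applied.)

```python
def dollar_band(estimate, s_log):
    """Convert a +/- 1 S_log band (base-10 logs) into dollar offsets.

    Returns (upper_offset, lower_offset), both as positive numbers.
    """
    factor = 10 ** s_log
    upper_offset = estimate * factor - estimate
    lower_offset = estimate - estimate / factor
    return upper_offset, lower_offset


S_LOG = 0.13          # hypothetical log standard error, for illustration only
SAMPLE_MEAN = 9060.0  # sample mean cost quoted in the text

up, down = dollar_band(SAMPLE_MEAN, S_LOG)
print(f"dollars:  +${up:,.0f}, -${down:,.0f}")
print(f"relative: +{up / SAMPLE_MEAN:.2f}, -{down / SAMPLE_MEAN:.2f}")
```

The upper offset always exceeds the lower one, and both scale with the
size of the estimate, so the values must be quoted at a stated point
such as the sample mean.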


DOCUMENTATION 

Once  an  estimating  relationship  has  been  developed,  a  written 
report  documenting  the  major  data,  assumptions,  and  analytical  results 
is indispensable. The following guidelines for such a report are
suggested.

1.  The scope and coverage of the study and the resulting equations
    should be fully and clearly described.

2.  Assuming that the study has made provision for a survey of
    work already performed in the area of interest (a very
    desirable part of a cost research study), a summary of the
    survey results should be presented.

3.  The major input data used in the study should be provided.
    Both the raw and adjusted data should be documented to the
    extent feasible. This includes data for both the dependent
    and independent variables. Data should be included not only
    for those cost categories and characteristics included in the
    final estimating equations, but also for those major
    characteristics which were considered but were dropped in the
    analysis. Any adjustments to the raw data which are made
    should be fully described and explained. The limitations and
    some indication of the accuracy of the data should be
    provided. Since one of the outputs of a cost research study is
    the data base itself it should be sufficiently described so
    as to be usable in future studies.

4.  The sources and dates of the data should be specifically
    identified.

5.  Each dependent and independent variable considered in the
    study should be fully and clearly defined. Unambiguous
    definitions of weapon system characteristics and cost elements
    are usually considerably more involved than appears at first
    glance.

6.  The major dependent versus single-independent variable scatter
    diagrams utilized in the study should be provided. The points
    on the diagrams should be labeled to identify the particular
    items.

7.  The final equations plus other major equation forms examined
    in the study should be presented along with such statistics
    as the standard error, correlation coefficient, coefficient
    of variation and prediction intervals (to the extent derived)
    for each equation. Any other criteria felt appropriate for
    indicating the goodness of fit and prediction capabilities of
    the equations should be provided.

8.  For the major final equations, tables such as Table III-3
    should be presented which show the observed values of the
    dependent variables, the estimated values, the deviations,
    and the percent deviation from the observed. The average
    percent deviation for the sample should also be presented.
    This not frequently used statistic* is felt to provide a good
    and easily understood measure of the goodness of fit.

Table III-3

ACTUAL AND ESTIMATED COSTS OF AIRBORNE COMMUNICATIONS EQUIPMENT

Actual      Estimated                   Percent
Cost        Cost         Deviation      Deviation

$22,200     $13,700      -$8,500        -38
 17,300      16,000      - 1,300        - 8
 11,800      17,400      + 5,600        +47
  9,600       9,200      -   400        - 4
  8,800       9,200      +   400        + 5
  7,600       6,400      - 1,200        -16
  6,800       6,900      +   100        + 1
  3,200       4,600      + 1,400        +44
  1,700       2,000      +   300        +18
  1,600       1,300      -   300        -19

Average percent deviation                20

*Note, however, that this is not the function minimized when using
the least squares technique for obtaining the equation coefficients.
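
The deviations and the average percent deviation in Table III-3 can be
recomputed from the first two columns. The short sketch below does so;
it assumes that the deviation is taken as estimated minus actual and
that the average is an average of absolute values, both of which are
inferences from the tabulated figures rather than definitions given in
the text.

```python
# Actual and estimated costs (dollars) from Table III-3.
actual = [22200, 17300, 11800, 9600, 8800, 7600, 6800, 3200, 1700, 1600]
estimated = [13700, 16000, 17400, 9200, 9200, 6400, 6900, 4600, 2000, 1300]


def percent_deviations(actual, estimated):
    """Signed deviation of each estimate as a percent of the actual cost."""
    return [100.0 * (e - a) / a for a, e in zip(actual, estimated)]


def average_percent_deviation(actual, estimated):
    """Mean of the absolute percent deviations (assumed definition)."""
    devs = percent_deviations(actual, estimated)
    return sum(abs(d) for d in devs) / len(devs)


for a, e, d in zip(actual, estimated, percent_deviations(actual, estimated)):
    print(f"{a:>7,} {e:>7,} {e - a:>+8,} {d:>+6.0f}")
print(f"Average percent deviation: {average_percent_deviation(actual, estimated):.0f}")
```

Averaging the signed percentages instead would let large overestimates
and underestimates cancel, which would defeat the purpose of the
statistic as a measure of goodness of fit.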


    In addition, a scatter diagram plotting the observed versus
    estimated values for the sample should be presented (see Fig.
    III-12). The points on the diagram should be labeled to
    identify the particular items.

    Fig. III-12--Estimated versus actual costs
    (horizontal axis: actual cost, thousands of dollars)

9.  The major alternative equations which were considered in the
    study, but rejected, should be described sufficiently for the
    reader to understand which were considered and why rejected.
    The reader should be given some feeling for the improvement
    gained by the selection of the final recommended forms over
    these other major alternatives. Alternative equations could
    involve such aspects as:

    a.  The use of different independent variables;

    b.  Different forms of the equations, e.g., linear,
        multiplicative (i.e., linear in the logs) or non-linear forms;

    c.  The use of different forms of the dependent variables,
        e.g., cost per pound or cost per item;

    d.  The use of stratified dependent variables grouped into
        sub-categories determined by such factors as ship or missile
        type, weight, frequency, speed regime, etc.

10. Any special methodology should be described, perhaps in an
    appendix if only of specialized interest (such as a
    sophisticated mathematical approach).

11. The methods used should be described fully and clearly. It
    should be possible from the information presented in the
    report for a reader to reconstruct from the same data base
    (though not necessarily agree with) the results of the study.
    The major assumptions, both statistical and otherwise, used
    in the derivation of the equations should be explicitly stated.

12. An example to illustrate the procedure for using the final
    cost-estimating relations is always helpful.

13. The limitations of the final equations should, to the extent
    possible, be clearly delineated and be as specific as
    possible. The range of characteristics over which the
    estimating procedure applies should be clearly stated as well
    as any other restrictions on the population covered by the
    equations.

IV. USING ESTIMATING RELATIONSHIPS

The widespread use of estimating relationships in the form of
simple cost factors, equations, curves, nomograms, and rules of thumb
attests to their value and to the variety of situations in which they
can be helpful. Yet an estimating relationship can only be derived
from information on what has occurred in the past, and the past is not
always a reliable guide to the future. As all horseplayers know, the
favorite runs out of the money often enough to prove that an estimate
based on past performance is quite likely to be wrong. Admittedly,
there may be other factors at work in the case of the racehorse, but
the problem remains the same as that encountered in any attempt to
predict the course of future events, i.e., how much confidence can be
put in the prediction? This question dominates all others in any
discussion of the use of estimating relationships.

These remarks are not intended to depreciate the value of
estimating relationships. They comprise the most important tool in an
estimator's kit and are in many cases the only tool. This being the case,
it is essential that their limitations be understood so that they will
not be used improperly. These limitations stem from two sources:
(1) the uncertainty inherent in any application of statistics and
(2) the uncertainty that an estimating relationship is applicable to
a particular article. The first pertains primarily to articles well
within the bounds of the sample on which the relationship is based and
says that uncertainty exists even here. The second refers to those
cases where the article in question has characteristics somewhat
different from those of the sample. Extrapolating beyond the sample,
although universally deplored by statisticians, is universally practiced
by cost analysts dealing with advanced hardware since in most cases
it is precisely those systems outside the range of the sample that are
of interest. The question is whether the equation is relevant to the
case at hand even though good statistical practice would question its
use.

UNDERSTANDING THE ESTIMATING RELATIONSHIP

Sometimes so much emphasis is placed on statistical treatment of
the data that a fundamental point is overlooked--an estimating
relationship must be reasonable and must have predictive value.

Reasonableness can be tested in various ways--by inspection, by
simple plots, and by some fairly complicated techniques which involve
an examination of each variable over a range of possible values.
Inspection will often suffice to indicate that an estimating relationship
is not structurally sound. For example, the following equation resulted
from an exercise at the Air Force Institute of Technology in which
students were asked to develop cost-estimating relationships for small
missiles:

C = 8347.5 + 150.6W - 1149.1R

where

C = cost of airframe + guidance and control
W = weight (lb)
R = range (mi)

This equation fits the data very well, but it says that as range
increases, cost decreases, and this intuitively seems wrong. If cost is
a function of range, we would expect the relationship to be direct
rather than inverse. To investigate further, we can choose two
hypothetical but reasonable values for W and R within the range of the
sample data (38.5 - 157 lb for W; 5.0 - 14.8 mi for R). As the table
below shows, Missile B, although heavier and with greater range than
Missile A, is estimated to be the cheaper of the two. This is contrary
to most experience and suggests that a re-examination of the sample data
and the equation is in order.

Hypothetical   Weight of Airframe +         Range   Estimated Cost of Airframe +
Missile        Guidance and Control (lb)    (mi)    Guidance and Control ($)

A              50                           5       11,133
B              75                           10      8,153

Sometimes an estimating relationship is developed to make a
particular estimate, but has no predictive value outside a very narrow
range. As an example of this, consider the following equation for
estimating the cost of solid propellant motors for small missiles:

Cost = 1195.6 + .000003 I^2

where

I = total impulse

The equation fits the sample data very well:

Missile Motor    Observed Cost    Estimated Cost

A                $2600            $2660
B                 1700             1693
C                 1250             1265
D                 1750             1781

If it were appropriate to use statistical measures for a sample of four,
one could say that this relationship explains over 99 percent of the
total variance. But note that the constant 1195.6 accounts for 94
percent of the cost of Motor C and that the cost of all motors smaller
than D will be about $1200. On the other hand, because of the I^2 term,
the influence of total impulse is likely to be too pronounced for motors
larger than those in the sample.
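
Both of the inspections described above amount to evaluating the
equation at a few chosen points. The sketch below does this for the
missile equation and for the constant term of the motor equation. The
function names are invented for the example, and the equations are
evaluated with the coefficients as printed, so the results may differ
somewhat from the tabulated estimates; the ordering of the two missiles
is what matters.

```python
def missile_cost(weight_lb, range_mi):
    """Airframe + guidance and control cost from the student equation."""
    return 8347.5 + 150.6 * weight_lb - 1149.1 * range_mi


def constant_share(estimated_cost, constant=1195.6):
    """Fraction of a motor's estimated cost contributed by the constant term."""
    return constant / estimated_cost


light_short = missile_cost(weight_lb=50, range_mi=5)
heavy_long = missile_cost(weight_lb=75, range_mi=10)
print(f"50 lb,  5 mi: ${light_short:,.0f}")
print(f"75 lb, 10 mi: ${heavy_long:,.0f}")
print("heavier, longer-range missile is cheaper:", heavy_long < light_short)

# Motor C has an estimated cost of $1265 in the sample table.
print(f"constant term share for Motor C: {constant_share(1265):.1%}")
```

A check of this kind takes only a moment and will catch a wrong-signed
coefficient or a dominant constant term that summary statistics of fit
do not reveal.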

A common method of examining the implications of an estimating
relationship for values outside the range of the sample is to plot a
scaling curve as illustrated in Fig. IV-1.* The theory underlying a scaling
curve is that as an item increases in weight (or some other dimension)
the incremental cost of each additional pound (or square foot, watt,
horsepower, etc.) will decrease or increase in some predictable way.
Thus, in Fig. IV-1 the cost per pound of an electrical power subsystem in
a manned spacecraft decreases from about $4200 to $1400 as the total weight
increases from 100 to 1000 lb. The slope of the curve is fairly steep,
and if the curve were extended to the right, one might expect to see
some flattening. Eventually, the curve might become completely flat
when no more economies of scale can be realized, but it is unlikely
that the slope would ever become positive.

Fig. IV-1--Scaling curve: cost per pound versus dry weight

* Scaling curves may be plotted on either arithmetic or logarithmic
graph paper as shown in Fig. IV-1. Because the log-linear representation
is more convenient to work with, this is the one generally used by cost
analysts.

Now  examine  Fig.  IV-2  where  total  impulse  is  plotted  against  cost 
per  pound-second  based  on  values  obtained  from  the  estimating  relation¬ 
ship  above.  Two  differences  are  immediately  seen.  First,  the  left- 
hand  portion  of  the  curve  is  unusually  steep.  Second,  the  slope  be¬ 
comes  positive  when  total  impulse  exceeds  about  24,000  lb-sec.  In  some 
instances, fabrication problems increase with the size of the object
being  fabricated  and  a  positive  slope  may  result.  No  such  problems  are 
encountered  in  the  manufacture  of  small  solid  propellant  rocket  motors, 
however,  and  continued  economies  of  scale  are  to  be  expected. 
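
The shape of the scaling curve implied by the motor equation can be
examined without plotting. Dividing Cost = 1195.6 + .000003 I^2 by I
gives cost per pound-second as 1195.6/I + .000003 I, which has its
minimum where I^2 = 1195.6/.000003. The sketch below evaluates this;
the names are invented for the example, and with the coefficients as
printed the turning point comes out near 20,000 lb-sec, in the same
general region as the value read from the figure.

```python
import math

CONSTANT = 1195.6      # constant term of the motor equation
COEFFICIENT = 0.000003  # coefficient of total impulse squared


def cost_per_lb_sec(total_impulse):
    """Estimated motor cost divided by total impulse."""
    return CONSTANT / total_impulse + COEFFICIENT * total_impulse


def turning_point():
    """Total impulse at which cost per pound-second stops decreasing."""
    return math.sqrt(CONSTANT / COEFFICIENT)


for impulse in (5_000, 10_000, 20_000, 40_000, 80_000):
    print(f"I = {impulse:>6,} lb-sec   cost per lb-sec = ${cost_per_lb_sec(impulse):.4f}")
print(f"slope changes sign near I = {turning_point():,.0f} lb-sec")
```

Tabulating a few points on either side of the turning point is usually
enough to show whether an equation implies diseconomies of scale that
the hardware does not justify.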

A  final  point  to  be  made  about  Fig.  IV-2  is  that  a  more  useful  esti¬ 
mating  relationship  could  have  been  obtained  by  drawing  a  trend  line 
than  by  fitting  a  curve  to  the  four  data  points.  With  a  small  sample, 
it  is  often  possible  to  write  an  equation  that  fits  the  data  perfectly, 
but  is  useless  outside  the  range  of  the  sample.  Statistical  manipulation 
of a sample this size rarely produces satisfactory results.

Fig. IV-2--Cost per pound-second versus total impulse


A final example of the kind of error that undue reliance on
statistical measures of fit may give rise to is based on an estimating
equation for aircraft airframes. Initially, an equation for estimating
airframe production labor hours was based on a sample of 44 aircraft.
It then seemed that grouping aircraft by type should give better
correlation, and in fact by considering bombers, fighters, trainers, etc.,
separately the average deviation between estimates and actual values
was markedly reduced. In the case of trainer aircraft, for example,
average deviation was reduced from 20 to 6 percent, and a more useful
estimating relationship obtained. In the case of fighters, however,
while average deviation was reduced from 15 to 11 percent, the
estimating equation, shown below, had a visible flaw:

Manhours/lb = 4.28 (weight)^[exponent illegible] (speed)^[exponent illegible]

The flaw is that the exponent of weight is greater than 1.0, and
this means that when speed is held constant and weight increased, the
manhours per pound of airframe weight will increase. This can be seen
in Fig. IV-3. The dashed lines show scaling curves derived from the
total sample of 44 aircraft. These portray the normal relationship--
as weight increases, hours per pound decrease. The regression equation
gives the opposite results because the general trend in fighter aircraft
has been for increased speed to be accompanied by increased weight, and
this causes an emphasis on the weight variable. One cannot assume,
however, that all new fighters will conform to this trend; and the
equation, if used at all, would have to be used with great care.

Fig. IV-3--Comparison of regression lines with scaling curves

The advice is frequently given that one should not use an estimating
relationship mechanically. This implies two things: (1) that the
function must be thoroughly understood and (2) that the hardware
involved must be understood as well. To illustrate the former, let us
examine an estimating relationship for direct manufacturing hours
derived from a sample of Navy and Air Force airframes:

$$H_{100} = 1.45\,W^{.74}\,S^{.43},$$

where H_100 = manufacturing labor hours required to produce the 100th
              airframe,
      W = gross takeoff weight (lb),
      S = maximum speed (kn).

The multiple correlation coefficient is 0.98 and the coefficient of
variation is .016 (in logarithmic terms). Despite these very
satisfactory measures of fit, it is always interesting to compare the
actual hours for each airframe in the sample with those estimated by
the equation to get a better understanding of how the relationship
relates to the real world. In such a comparison, as shown by the
summary table below, 33 percent of the estimates differ from the
actuals by more than 20 percent, and 7 percent differ by more than 30
percent. These figures imply that a person who has nothing to rely on
but the estimating relationship may or may not come up with a good
estimate. However, if the poorer results can be explained in some way,
the analyst is then in a much better position to understand the
strengths and weaknesses of the equation.

    Difference Between
    Actual Hours and        Number of      Percentage
    Estimated Hours         Airframes      of Sample

    10% or less                15              56
    11-20%                      3              11
    21-30%                      7              26
    31-40%                      2               7
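The comparison of actual and estimated hours can be sketched in a few lines of Python. This is only an illustration: the (actual, estimated) pairs below are hypothetical, the function names are invented for the example, and the difference is assumed to be expressed as a percentage of actual hours, which the text does not state explicitly.

```python
from collections import Counter


def percent_difference(actual, estimated):
    """Absolute difference between estimate and actual, as a percent of actual (assumed basis)."""
    return abs(estimated - actual) / actual * 100.0


def bucket(pct):
    """Assign a percentage difference to the ranges used in the summary table."""
    if pct <= 10:
        return "10% or less"
    if pct <= 20:
        return "11-20%"
    if pct <= 30:
        return "21-30%"
    if pct <= 40:
        return "31-40%"
    return "over 40%"


def summarize(pairs):
    """Return {range label: (number of airframes, rounded percentage of sample)}."""
    counts = Counter(bucket(percent_difference(a, e)) for a, e in pairs)
    total = len(pairs)
    return {label: (n, round(100.0 * n / total)) for label, n in counts.items()}


# Hypothetical (actual hours, estimated hours) pairs, not taken from the sample.
sample = [(100_000, 104_000), (250_000, 215_000), (80_000, 101_000), (400_000, 250_000)]
summary = summarize(sample)
for label, (count, share) in summary.items():
    print(f"{label:12s} {count:3d} {share:4d}%")
```

The airframes that fall in the larger-difference ranges are the ones whose manufacturing history is worth examining individually.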

Since this estimating relationship is based on gross takeoff weight and
maximum speed, an initial hypothesis to explain the variations might be
that at one end of the weight or speed range or for some combinations
of weight and speed, the estimates decrease in quality. In this case,
however, as shown by Fig. IV-4, the poorer estimates are scattered
throughout the sample, thus indicating no consistent bias because of
the explanatory variables.

A second hypothesis might be that the manufacturing history of the
airframes in the sample should explain the discrepancies, and, in
general, this hypothesis seems valid. Of the nine airframes in the
sample for which estimates differed from actuals by 20 percent or more,
several were considered "problem" airframes, that is, airframes where
the manufacturer had an abnormal number of problems in meeting weight
and/or performance specifications. Interestingly enough, these were
not aircraft in which a major state-of-the-art advance was being
attempted. Another cause for discrepancy was discovered to be
interspersion of different models of the same aircraft in a single
lot--e.g., reconnaissance versions of a bomber were interspersed among
bomber airframes--and changes of this kind increase direct labor
requirements. The two airframes for which the estimates were the
poorest, requiring almost 40 percent less labor than the equation
predicts, were vastly different ones--a large transport and a
supersonic fighter. One of these benefited from the manufacturer's
concurrent experience with a commercial aircraft of similar
configuration. The other cannot be explained; it simply appears that
the labor content of this aircraft was unusually low.

Fig. IV-4 (axis label: maximum speed)

However, while it never is possible to resolve all the uncertainties,
with information such as this, an estimator can feel reasonably
confident that the estimating relationship does not contain a systematic
bias, that it should be applicable to normal production programs, and
that it provides reasonable estimates throughout the breadth of the
sample.

UNDERSTANDING  THE  HARDWARE 

This sample included aircraft with gross takeoff weights of 6,100 lb to
450,000 lb and maximum speeds of from 300 kn to 1,200 kn. Suppose a
proposed new aircraft has a gross weight of 500,000 lb or a maximum
speed of 1,700 kn. Should the estimating equation be used here? The
same question could arise for an aircraft whose weight and speed are in
the sample range, but which is to be fabricated by a new process or out
of a new material. Again, the estimator must decide whether the
equation is relevant or how it can be modified to be useful. All of
this points to the fact that an estimating relationship can be used
properly only by a person familiar with the type of equipment whose
cost is to be estimated. To say that a person estimating the cost of a
destroyer should know something about destroyers may be a truism, but
an estimator is sometimes far removed from the actual hardware.
Further, he may be expected to provide costs for everything from
air-to-air missiles one week to a new anti-ballistic missile system the
next. The tendency in such a situation may be to use whatever equation
looks best without taking a detailed look to determine whether it
really is applicable or not.

To illustrate the problem, let us assume that a new bomber is proposed
with a gross weight of 450,000 lb and a maximum speed of 1,700 kn. The
estimating equation discussed above may be inappropriate because the
speed is so far beyond the range of the sample. On the other hand, no
equation exists for aircraft in that speed range, and an estimate is
required. This may be regarded as the normal situation, and one has no
choice but to make do with what is available. In this example, use of
the equation gives 542,000 direct labor manufacturing hours.
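The calculation behind the 542,000-hour figure can be sketched as follows. The exponents 0.74 and 0.43 are an assumption: they are read from a damaged copy of the equation and accepted here only because they reproduce the figure quoted in the text. The function names are invented for the example.

```python
# Sample limits quoted in the text: 6,100-450,000 lb and 300-1,200 kn.
SAMPLE_RANGE = {"weight_lb": (6_100, 450_000), "speed_kn": (300, 1_200)}


def hours_100th_airframe(gross_weight_lb, max_speed_kn):
    """Direct manufacturing hours for the 100th airframe.

    Assumed form: H100 = 1.45 * W**0.74 * S**0.43 (exponents inferred, see lead-in).
    """
    return 1.45 * gross_weight_lb ** 0.74 * max_speed_kn ** 0.43


def extrapolated_inputs(gross_weight_lb, max_speed_kn):
    """List the explanatory variables that fall outside the sample range."""
    values = {"weight_lb": gross_weight_lb, "speed_kn": max_speed_kn}
    return [
        name
        for name, value in values.items()
        if not (SAMPLE_RANGE[name][0] <= value <= SAMPLE_RANGE[name][1])
    ]


estimate = hours_100th_airframe(450_000, 1_700)
print(f"Estimated hours: {estimate:,.0f}")   # close to the 542,000 quoted in the text
print("Outside sample range:", extrapolated_inputs(450_000, 1_700))
```

Flagging the out-of-range variable does not settle whether the equation should be used; it only marks the estimate as one that needs the kind of cross-checking the text goes on to describe.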

The next step is to compare the result with other somewhat similar
systems to see if the estimate appears reasonable. Thus, in this
instance one might plot hours versus gross weight for several other
large aircraft as in Fig. IV-5. The supersonic bomber (SSB1) is
substantially above the trend, and this is as it should be. A 1,700-kn
airframe is going to be more difficult to build than a subsonic
airframe of the same size, and lacking any other information an
estimator might be inclined to accept the figure of 542,000 hr. In this
case, however, all the airframes in the sample were fabricated almost
entirely of aluminum, while an airframe built to withstand the heat
generated by sustained flight in the atmosphere at around Mach 3 will
require a metal such as stainless steel or titanium. The question that
arises is whether the speed variable in the equation fully accounts for
this change in technology.

Fig. IV-5--Trend line for large aircraft (horizontal axis: gross
takeoff weight, thousands of lb)

One way to approach this question is to plot a second scatter diagram,
this time with speed as the independent variable. Figure IV-6 shows
labor hours per pound of airframe weight plotted against speed with a
calculated line of best fit drawn through the scatter. Assuming an
airframe weight of 125,000 lb out of a gross weight of 450,000 lb, the
estimate of 542,000 hr is equal to 4.3 hr/lb of airframe, which (shown
on Fig. IV-6 as SSB1) is not only below the calculated trend line, it
is below any reasonable trend line that can be drawn through the
sample. At this point, we might say that we have three estimates:
542,000 hr based on speed and weight, about 300,000 hr based on weight
alone (from Fig. IV-5), and about 925,000 hr based on speed alone (from
Fig. IV-6--7.4 hr/lb x 125,000 lb = 925,000 hr). More information is
needed to narrow this range, and although information on this subject
is something less than abundant, several experimental and prototype
aircraft have been fabricated using stainless steel and titanium.

Fig. IV-6--Labor hours per pound versus maximum speed
One manufacturer, on the basis of his experience with several
prototypes, maintains that a titanium airframe requires twice the hours
of an aluminum airframe. This is interesting but not very helpful
information because manufacturing hours for an aluminum airframe can
vary considerably. A second indication is more precise. An examination
of the fragmentary data available on several different airframes with
speeds of Mach 3 and above tends to show that they require about 1.5
times as many hours as are estimated by the estimating relationship
above. This implies 813,000 hr or 6.5 hr/lb for the supersonic bomber.
This point is shown as SSB2 on Fig. IV-6. On the basis of what is
currently known, this appears to be a reasonable estimate. One could go
further, of course, and make another independent estimate using a
different estimating relationship. For most kinds of hardware, an
estimator does not have this option because estimating relationships
are not all that plentiful. In the case of airframes, however, a number
of equations have been developed over the years, and it is a good idea
to use one to confirm an estimate made with another.
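The arithmetic that ties the several estimates together is simple enough to lay out explicitly. The sketch below uses only figures quoted in the text (the 125,000-lb airframe weight, the chart readings of 300,000 hr and 7.4 hr/lb, and the factor of 1.5); the variable names are illustrative.

```python
AIRFRAME_WEIGHT_LB = 125_000       # airframe weight assumed in the text

equation_hours = 542_000           # from the speed-and-weight equation
weight_trend_hours = 300_000       # approximate reading from Fig. IV-5
speed_trend_hr_per_lb = 7.4        # approximate reading from Fig. IV-6
high_speed_factor = 1.5            # fragmentary Mach 3 data: about 1.5 times the equation

# Convert every estimate to both total hours and hours per pound of airframe.
speed_trend_hours = speed_trend_hr_per_lb * AIRFRAME_WEIGHT_LB
equation_hr_per_lb = equation_hours / AIRFRAME_WEIGHT_LB
adjusted_hours = high_speed_factor * equation_hours
adjusted_hr_per_lb = adjusted_hours / AIRFRAME_WEIGHT_LB

print(f"Equation (SSB1): {equation_hours:,.0f} hr, {equation_hr_per_lb:.1f} hr/lb")
print(f"Weight trend:    {weight_trend_hours:,.0f} hr")
print(f"Speed trend:     {speed_trend_hours:,.0f} hr")
print(f"Adjusted (SSB2): {adjusted_hours:,.0f} hr, {adjusted_hr_per_lb:.1f} hr/lb")
```

The adjusted figure falls between the weight-only and speed-only estimates, which is part of what makes it appear reasonable.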

JUDGMENT 

The need for judgment is often mentioned in connection with the use of
estimating relationships, and while this need may be self-evident, one
of the problems in the past is that there has been too much judgment
and too little reliance on estimating relationships. One problem, that
of introducing personal bias along with judgment, has been studied in
other contexts and the conclusions are probably applicable here. In
brief, it appears that a person's occupation or position strongly
influences his forecasts. Thus, one can expect to find a consistent
tendency toward low estimates among those persons whose best interests
are served by low estimates, e.g., proponents of a new weapon or
support system whether in industry or government. Similarly, there are
people, again both in industry and government, whose bread is buttered
on the side of caution. As a consequence, their estimates are likely to
run higher than would be the case were they free from all external
pressures. (In all fairness to this latter group, however, it must be
said that overestimates are sufficiently rare to suggest that caution
is not a quality to be despised.)

The primary use of judgment should be to decide (1) whether an
estimating relationship can be used for an advanced system, and (2) if
so, what adjustments may be necessary to take into account the impact
of technology not present in the sample. Judgment is also required to
decide whether the results obtained from an estimating relationship are
reasonable. This does not mean reasonable according to some
preconception of what the cost ought to be, but reasonable when
compared to what similar hardware has cost in the past. A typical test
for reasonableness is to look at a scattergram of costs of analogous
equipment at some standard production quantity as in the sketch below.


The estimate of the article may be outside the trend lines of the
scattergram and still be correct, but an initial presumption exists
that a discrepancy has been discovered and this discrepancy must be
investigated. An analyst who emerges from his deliberations with an
estimate implying that new, higher performance equipment can be
procured for less than existing hardware knows his task is not
finished. If after some research he is convinced that the estimate is
correct, he should then be prepared to explain what new development is
responsible for the decrease in cost.

What he should not do is raise the cost arbitrarily by some percent to
make it appear more acceptable or because he has a visceral feeling
that the estimate is too low. (Visceral judgments are the province of
management and are generally occasioned by reasons somewhat removed
from those discussed here.) Judgments based on evidence of some kind
that an estimate is too high or too low are another matter, and the
only injunction to be observed is that the change be fully documented
so that: (1) the estimate can be thoroughly understood by others, and
(2) the equations can be re-examined in the light of the new data.

For many years now it has been standard practice throughout the
aerospace industry to make use of what have been variously called
"learning," "progress," "improvement," or "experience" curves to
predict reductions in cost as the number of items produced increases.
This learning process is a phenomenon which exists in many industries;
its existence has been verified by empirical data and controlled tests.
While there are several different hypotheses about the exact manner in
which this learning or cost reduction occurs, the main content of
learning curve theory is that each time the total quantity of items
produced doubles, the cost per item is reduced to some constant
percentage of its previous value. Alternative forms of the theory refer
to the incremental (unit) cost of producing an item at a given quantity
or to the average cost of producing all items up to a given quantity.
If, for example, the cost of producing the 200th unit of an item is 80
percent of the cost of producing the 100th item, the cost of the 400th
unit is 80 percent of the cost of the 200th, and so forth, then the
production process is said to follow an 80 percent unit learning curve.
If the average cost of producing all 200 units is 80 percent of the
average cost of producing the first 100 units, etc., then the process
follows an 80 percent cumulative average learning curve.

Either formulation of the theory results in an exponential function
that is linear on logarithmic grids. Figure V-1 shows a unit curve for
which the reduction in cost is 20 percent with each doubling of
cumulative output, the upper figure showing the curve on arithmetic
grids and the lower on logarithmic grids. The arithmetic plot
emphasizes an important point--that the percentage reduction in cost
in each unit is most pronounced for the early units. On an 80 percent
curve, for example, cost decreases to 28 percent of the original value
over the first 50 units. Over the next 50 units, it declines only five
more percentage points, i.e., down to about 23 percent of unit one cost.
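The figures quoted for the 80 percent curve can be checked with a short sketch. It assumes the usual log-linear unit curve, in which the cost of unit X is the first-unit cost times X raised to the power log(slope)/log 2; the function names are illustrative.

```python
import math


def learning_exponent(slope):
    """Exponent b such that doubling quantity multiplies cost by `slope` (e.g. 0.80)."""
    return math.log(slope) / math.log(2.0)


def unit_cost(unit_number, slope, first_unit_cost=1.0):
    """Cost of a given unit on a log-linear unit learning curve."""
    return first_unit_cost * unit_number ** learning_exponent(slope)


for unit in (1, 2, 4, 50, 100):
    # Units 50 and 100 come out near 0.28 and 0.23 of unit-one cost, as in the text.
    print(f"unit {unit:3d}: {unit_cost(unit, 0.80):.3f} of unit-one cost")
```

Most of the decline occurs in the first few dozen units, which is why the curve looks so steep at the left of the arithmetic plot.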


Fig. V-1--The 80 percent learning curve on arithmetic and logarithmic
grids (horizontal axis: cumulative unit number)

The factors that account for the decline in unit cost as cumulative
output increases are numerous and not completely understood; those most
commonly mentioned are:

1.  Job  familiarisation  by  workmen,  which  results  from  the 
repetition  of  manufacturing  operations. 

2.  General  improvement  in  tool  coordination,  shop 
organization,  and  engineering  liaison. 

3.  Development of more efficiently produced subassemblies.

4.  Development of more efficient parts-supply systems.

5.  Development  of  more  efficient  tools. 

6.  Substitution  of  cast  or  forged  components  for 
machined  components. 

This is not a complete list of the relevant factors, of course, and it
tends to understate the importance of the item sometimes considered the
most important--labor learning. Labor cost, however, cannot decline
through experience gained by workmen unless management also becomes
more efficient. In other words, it is also necessary for management to
organize and coordinate the work of all manufacturing departments more
efficiently so that parts and assemblies will flow through the plant
smoothly.

Labor cost is not the only element of manufacturing that declines as
cumulative output increases. A learning curve also exists for unit
materials cost. The materials category frequently includes a great deal
of purchased equipment, which in turn includes a substantial number of
engineering, tooling, and labor hours. These hours decline as
production quantities increase, and the contractor who buys in
successive lots is generally able to negotiate a lower price for each
lot. Decreases in raw material costs are generally attributed to two
factors: as cumulative output increases, (1) the workmen learn to work
the raw materials more efficiently and so cut down spoilage and reduce
the rejection rate, and (2) management learns to order materials from
suppliers in shapes and sizes that reduce the amount of scrap that must
be shaved and cut from the pieces of sheet, bar, etc., to fabricate the
item of equipment. Substitution of forgings for machined parts also
reduces the amount of scrap material. An additional factor probably
responsible to a lesser extent for the decline in materials cost is the
pricing policy of the raw material suppliers. These suppliers generally
reduce the price per pound for the various kinds of raw materials if an
order is sufficiently large. While the learning curve pertains to cost
reductions as materials are applied to successive lots and not to
reductions due to volume purchases, segregation of the two effects is
imperfect. This accounts for some of the difference in learning curve
slopes.

A third major component of cost--overhead--also declines with
cumulative output, but as a result of the method of allocating
overhead, not because of a perceptible relationship between overhead
rate and cumulative output. Direct labor hours per unit decline as
cumulative output increases and overhead is often distributed to each
unit on the basis of direct labor cost or hours. As a consequence, it
is inappropriate to talk about a learning curve for this element of
cost.

THE  LINEAR  HYPOTHESIS 

This relationship between cost and quantity may be represented by an
exponential (log-linear) equation of the form

$$C = aX^{b},$$

where X equals cumulative production quantity. The relationship
corresponds to a unit or a cumulative average learning curve according
to whether C is the cost of the Xth unit or the average cost of the
first X units. The constant a is the cost of the first unit produced.
The exponent b, which measures the slope* of the learning curve, bears
a simple relationship to the constant percentage to which cost is
reduced as the quantity is doubled. If the fraction to which cost
decreases when quantity doubles is represented by p, we have

$$p = 2^{b}, \qquad b = \frac{\log p}{\log 2}.$$

The cumulative average cost, A, of producing N units is then

$$A = \frac{a}{N}\sum_{X=1}^{N} X^{b}.$$

The relationship between the unit curve and the cumulative average
curve is shown by Fig. V-2. The relationship between A and N is not
log-linear; however, as N becomes larger, A approaches asymptotically
the value

$$\frac{aN^{b}}{1+b},$$

which differs from the expression for unit cost only by the constant
factor 1/(1+b). Consequently, if unit cost has been estimated at a
sufficiently large quantity, the cumulative average cost for the same
quantity may be approximated by multiplying the unit measure by
1/(1+b).

*In learning curve literature the term "slope" has not only its usual
meaning but also refers to this percentage reduction, e.g., an 80
percent slope means a curve with a b value of -.322.
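The relation between slope and exponent, and the asymptotic shortcut for the cumulative average, can be sketched as follows. The sketch assumes the log-linear unit curve with exponent b = log p / log 2, computes the exact cumulative average by direct summation, and compares it with the unit cost multiplied by 1/(1+b). Function names are illustrative.

```python
import math


def exponent_from_slope(p):
    """Exponent b for a curve whose cost falls to fraction p when quantity doubles."""
    return math.log(p) / math.log(2.0)


def cumulative_average(n, p, a=1.0):
    """Exact cumulative average cost of the first n units on a unit curve."""
    b = exponent_from_slope(p)
    return a * sum(x ** b for x in range(1, n + 1)) / n


def asymptotic_average(n, p, a=1.0):
    """Approximation: unit cost at n multiplied by 1/(1+b)."""
    b = exponent_from_slope(p)
    return a * n ** b / (1.0 + b)


def relative_error(n, p):
    """Fractional amount by which the approximation overstates the exact average."""
    exact = cumulative_average(n, p)
    return (asymptotic_average(n, p) - exact) / exact


print(f"b for an 80 percent slope: {exponent_from_slope(0.80):.3f}")
for slope in (0.80, 0.75):
    for quantity in (100, 1_000):
        print(f"slope {slope:.2f}, quantity {quantity:5d}: "
              f"relative error {relative_error(quantity, slope):.4f}")
```

Running the comparison for a given slope and quantity is a quick way to decide whether the quantity is large enough for the shortcut to be acceptable.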

When a production process follows a cumulative average curve rather
than a unit curve, the basic functional form is still $C = aX^{b}$ but
can be written $A = aX^{b}$, where A is the average cost of the first X
units. The cumulative cost for producing X units is simply AX, and the
unit cost is obtained from the equation

$$\text{unit cost} = a\left[X^{1+b} - (X-1)^{1+b}\right].$$

The relationship between a linear cumulative average curve and the
resulting unit curve is illustrated by Fig. V-3.
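When the cumulative average curve is the log-linear one, the unit cost is the difference between successive cumulative totals. A minimal sketch, assuming the cumulative total for X units is a times X raised to the power 1+b (function names are illustrative):

```python
import math


def exponent_from_slope(p):
    """Exponent b for a curve whose cost falls to fraction p when quantity doubles."""
    return math.log(p) / math.log(2.0)


def cumulative_total(x, p, a=1.0):
    """Total cost of the first x units when the cumulative average is a * x**b."""
    return a * x ** (1.0 + exponent_from_slope(p))


def unit_cost(x, p, a=1.0):
    """Cost of unit x: cumulative total through x minus cumulative total through x - 1."""
    return cumulative_total(x, p, a) - cumulative_total(x - 1, p, a)


# On an 80 percent cumulative average curve, two units average 0.8, so they total 1.6
# and the second unit costs 0.6 of the first.
for unit in (1, 2, 3, 4):
    print(f"unit {unit}: {unit_cost(unit, 0.80):.3f}")
```

The resulting unit curve lies below the cumulative average curve and is not itself linear on logarithmic grids for the early units.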

These equations may appear cumbersome to work with but in practice much
of the work involved in using learning curves has been made simpler by
the preparation of tables giving the relationship between cumulative
total, cumulative average, and unit cost for a range of slopes and
quantities.** Tables V-1 and V-2 give values of these equations for
selected slopes and quantities when a is equal to one. Use of more
detailed tables is recommended, but to determine approximate solutions
for values not listed, one may interpolate between given values of
quantity and slope.

To illustrate how the tables are used, assume a linear unit curve with
a slope of 95 percent. From the first row in Table V-1, it can be seen
that the cost of unit 2 is 95 percent of the cost of unit 1. Similarly,
the cost of unit 4 is 95 percent of the cost of unit 2, or .903 of the
cost of unit 1 (.95 x .95 = .903). Thus, if the cost of any unit is
known, the cost of any other can be calculated from this table. For
example, given the value of unit 25, unit 100 cost would be obtained
from the ratio .711/.788 or .902, i.e., the 100th unit would be 90.2
percent of the cost of unit 25.

*Whether or not a quantity is sufficiently large so that the asymptotic
method will provide a good approximation depends on the slope of the
learning curve. For the 80 percent curve, the asymptotic method
produces an error of about 1 percent at quantity 100; for a 75 percent
curve, the error at quantity 100 is almost 5 percent and does not
decrease to 1 percent until a quantity in excess of 1,000 has been
reached.

**See, for example, The Experience Curves, Vol. I (67-84%) and Vol. II
(85-99%), Army Missile Command, Redstone Arsenal, Alabama (available
from the Defense Documentation Center).

SLOPE-QUANTITY FACTORS FOR THE LOG-LINEAR UNIT CURVE

SLOPE-QUANTITY FACTORS FOR THE LOG-LINEAR CUMULATIVE AVERAGE CURVE
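The table lookup in this example amounts to taking a ratio of two unit-curve factors. A minimal sketch, assuming the log-linear unit curve with first-unit cost of one; the 10,000-hour figure for unit 25 is hypothetical and the function names are illustrative:

```python
import math


def unit_factor(unit_number, slope):
    """Unit-curve factor: cost of a unit relative to unit 1 on a log-linear unit curve."""
    return unit_number ** (math.log(slope) / math.log(2.0))


def cost_of_other_unit(known_cost, known_unit, target_unit, slope):
    """Scale a known unit cost to another unit using the ratio of the two factors."""
    return known_cost * unit_factor(target_unit, slope) / unit_factor(known_unit, slope)


factor_25 = unit_factor(25, 0.95)     # tabulated as .788
factor_100 = unit_factor(100, 0.95)   # tabulated as .711
ratio = factor_100 / factor_25        # about .902

# Hypothetical example: unit 25 is known to take 10,000 hours.
hours_unit_100 = cost_of_other_unit(10_000, 25, 100, 0.95)
print(f"factors: {factor_25:.3f}, {factor_100:.3f}; ratio {ratio:.3f}")
print(f"unit 100: {hours_unit_100:,.0f} hours")
```

Because unit 100 is two doublings beyond unit 25, the ratio is simply .95 x .95.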

Since the cumulative average curve is always above the unit curve, the
cumulative average cost at any given quantity will be greater than the
unit cost. As shown in Table V-1, the cumulative average cost at unit 2
is .975 (the average of unit costs of 1.0 at unit 1 and .95 at unit 2).
To move quickly from the unit curve to the cumulative average curve, a
simple ratio is provided in the bottom portion of Table V-1.

It is probably fair to say that in actual practice the unit cost is
most frequently considered to be linear, but there are sufficient
exceptions to this statement to suggest that the choice is a matter of
preference rather than necessity. Once the choice is made, however, it
is of the utmost importance to apply the technique consistently. As is
evident from the example above, confusing one type of curve for the
other could result in large errors.

NONLINEAR HYPOTHESES

Throughout succeeding sections of this chapter it is assumed that the
linear hypothesis applies, i.e., that the learning curve is linear when
plotted on logarithmic grids. It must be mentioned, however, that this
is not the only possible formulation of the learning curve. A number of
studies have indicated that the curve is not linear. One of the best
known of these is the Stanford Research Institute investigation of 20
World War II aircraft. This study proposed


as a more reliable expression of the relationship between manhour cost
and cumulative output. Here the decision to find a substitute function
was apparently prompted by a visual inspection of several series that
seemed to indicate a concavity in the unit learning curve.* This
concavity early in the series has been recognized independently in
other studies.

On the other hand, it has been noted in some cases that beyond certain
values of cumulative output, both the labor and the production cost
curves develop convexities. The theory of a linear unit curve
implicitly assumes that its component curves (e.g., fabrication,
subassembly, and major and final assembly) are parallel to the linear
unit curve, and this implies that the rate of learning on all
production jobs in all departments is the same. One would expect,
however, that the departmental learning curves would have different
slopes from each other (e.g., fabrication might be 90 percent;
subassembly, 85 percent; and major and final assembly, 70 percent). If
this is the case, the sum of these curves (the unit curve) would
approach as a limit the flattest of the departmental curves.
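The tendency of a composite curve to approach its flattest component can be illustrated numerically. The sketch below uses the three departmental slopes given as examples in the text; the equal first-unit costs are an assumption made purely for illustration, and the "local slope" is taken as the ratio of composite unit cost at 2X to that at X.

```python
import math

# (first-unit cost, slope) per department; equal first-unit costs are assumed.
DEPARTMENTS = {
    "fabrication": (1.0, 0.90),
    "subassembly": (1.0, 0.85),
    "major and final assembly": (1.0, 0.70),
}


def composite_unit_cost(x):
    """Sum of the departmental log-linear unit curves at cumulative unit x."""
    return sum(
        a * x ** (math.log(slope) / math.log(2.0))
        for a, slope in DEPARTMENTS.values()
    )


def local_slope(x):
    """Fraction to which composite unit cost falls between unit x and unit 2x."""
    return composite_unit_cost(2 * x) / composite_unit_cost(x)


for quantity in (10, 100, 1_000, 1_000_000):
    print(f"unit {quantity:9,d}: local slope {local_slope(quantity):.3f}")
```

The local slope rises steadily toward 90 percent as the steeper departmental curves shrink in relative importance, so the composite curve is convex on logarithmic grids rather than linear.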

A considerable amount of literature is available describing the bases
for and hypotheses about learning curves, and it is beyond the scope of
this chapter to attempt to cover this background material in any
detail.** A list of some of the most useful reports on the subject is
appended for those interested in pursuing the matter further. For our
purpose here, we stipulate that the learning curve has become a useful
and accepted estimating tool, particularly in the aerospace industry,
that the log-linear curve is the one most commonly used, and that a
knowledge of its mechanics is indispensable to persons making or using
cost estimates.

*"Concavity" in this context means that when viewed on log-log paper
the curve declines at an increasingly steep slope as it moves away from
the y-axis. In the formulation proposed by the Stanford study the curve
becomes essentially linear as X becomes large relative to b.

**One subject not discussed at all concerns the effect of production
rate on unit cost. Economic theory generally holds that this
relationship can be described by a U-shaped function: cost declines as
production rate increases, then is insensitive to rate over some range
and eventually begins to rise again. In learning curve applications, on
the other hand, an implicit assumption is that cost is not affected by
rate of output (or that the rate is constant). Empirical evidence of
the interaction between the volume and rate effects is scanty, but for
a good illustration of the problem see: Preston, L., and E. Keachie,
"Cost Functions and Progress Functions: An Integration," American
Economic Review, March 1964, pp. 100-107.


PLOTTING  A  CURVE 


The graphical representation of learning curves involves the problem of
representing the average cost for a lot or a complete contract, since,
typically, manhours or costs are not recorded by unit. The following
sample illustrates this.


                         Manufacturing
    Lot      Units       Hours per Lot

     1        1-10           5,830
     2       11-20           4,370
     3       21-50          10,550
     4       51-100         14,750

To plot a cumulative average curve from these data the cumulative
average hours at the final unit in each lot are computed, as shown
below. The cumulative average at the 10th unit is 583 hours; and this
is the first plot point. Successive plot points are at the end of each
lot, since these are the points at which the cumulative average hour
figures apply.


    Plot Point    Manufacturing                        Cumulative
      (Unit)      Hours per Lot     Computation       Average Hours

        10            5,830        (5,830 ÷ 10)            583
        20            4,370        (10,200 ÷ 20)           510
        50           10,550        (20,750 ÷ 50)           415
       100           14,750        (35,500 ÷ 100)          355
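The cumulative average plot points follow directly from a running total of lot hours. A minimal sketch using the lot data above (the function name is illustrative):

```python
# (last unit in lot, manufacturing hours for the lot), taken from the sample data.
LOTS = [(10, 5_830), (20, 4_370), (50, 10_550), (100, 14_750)]


def cumulative_average_points(lots):
    """Return (plot point, cumulative average hours) at the last unit of each lot."""
    points = []
    running_hours = 0
    for last_unit, lot_hours in lots:
        running_hours += lot_hours
        points.append((last_unit, running_hours / last_unit))
    return points


points = cumulative_average_points(LOTS)
for unit, average in points:
    print(f"unit {unit:3d}: {average:.0f} cumulative average hours")
```

Each average is plotted at the last unit of its lot because that is the quantity to which the cumulative figure applies.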

To  plot  the  unit  curve,  however,  it  is  necessary  first  to  compute 
the  unit  hours  and  then  to  establish  plot  points.  The  unit  hours  can 
be  taken  as  an  average  for  each  lot,  that  is: 


    Lot       Computation        Unit Hours

     1       (5,830 ÷ 10)           583
     2       (4,370 ÷ 10)           437
     3       (10,550 ÷ 30)          352
     4       (14,750 ÷ 50)          295

The lots can be represented by these unit hour values. The question is,
where should the values be plotted? To plot at the lot midpoint is to
assume that the learning curve can be approximated by a linear curve on
arithmetic grids, but as we have seen from Fig. V-1, this assumption
only becomes reasonable after a number of units have been produced. The
effect of choosing the arithmetic midpoint as the plot point for the
first lot is illustrated in Fig. V-4. This figure shows that for a
learning curve plotted on arithmetic grids, the area under the curve
from A to the midpoint is greater than that from the midpoint to B.
Only when the algebraic midpoint is chosen, which is somewhat to the
left of the arithmetic midpoint, will the area under the curve be equal
for the two segments.


Fig. V-4--Learning curve on arithmetic grids


It is the algebraic midpoint, then, instead of the arithmetic
midpoint, through which the unit curve should be drawn for the first few
lots. This can be obtained from the following equation:

$$K = \left[ \frac{L(1 + b)}{N_2^{1+b} - N_1^{1+b}} \right]^{-1/b}$$

where K = algebraic lot midpoint,

N_1 = first unit in lot minus .5,

N_2 = last unit in lot plus .5,

L = number of units in lot,

b = slope of learning curve.

Tables allowing rapid computation of lot midpoints for specific
slopes and lot quantities are also available.* Note that this
procedure assumes a knowledge of the learning curve slope. Actually,
an approximation of slope is all that is required since the results
are not very sensitive to this parameter.
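
As a rough illustration, the lot midpoint can be computed directly. The sketch below assumes the commonly used closed form K = [L(1 + b) / (N_2^(1+b) - N_1^(1+b))]^(-1/b), with N_1 the first unit minus .5 and N_2 the last unit plus .5, and takes b = log S / log 2 for a slope S. The function name and the slopes tried are illustrative assumptions.

```python
import math


def algebraic_lot_midpoint(first_unit, last_unit, slope):
    """Approximate algebraic midpoint of a lot on a unit learning curve.

    slope is the learning-curve percentage as a fraction (0.80 for 80 percent).
    The exponent b = log(slope) / log(2) is negative for slope below 1.
    Not defined for slope = 1.0 or slope = 0.5, where the closed form divides by zero.
    """
    b = math.log(slope) / math.log(2)
    n1 = first_unit - 0.5          # lower limit of the lot
    n2 = last_unit + 0.5           # upper limit of the lot
    lot_size = last_unit - first_unit + 1
    ratio = lot_size * (1 + b) / (n2 ** (1 + b) - n1 ** (1 + b))
    return ratio ** (-1 / b)


# First lot of 10 units under three assumed slopes.
for s in (0.65, 0.80, 0.95):
    k = algebraic_lot_midpoint(1, 10, s)
    print(f"slope {s:.2f}: first-lot midpoint near unit {k:.2f}")

# For a later lot the result sits close to the arithmetic midpoint.
later = algebraic_lot_midpoint(51, 100, 0.80)
print(f"lot 51-100: algebraic {later:.1f}, arithmetic {(51 + 100) / 2:.1f}")
```

For the first lot the computed midpoint falls well to the left of the arithmetic midpoint of 5.5 and moves only moderately as the slope changes, which is consistent with the remark that an approximate slope is sufficient.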

Less precise, but somewhat handier than the above equation, is
the graph shown in Fig. V-5 which provides plot points for early lot
quantities of less than 100. These points represent an average of the
range obtained from 65 and 95 percent slopes. The graph is used as
follows:

1. First unit of contract lot is found on the 45-degree line.

2. The curve extending out from this line is followed to the
point on the horizontal axis which represents the last
unit of the lot.

3. The plot point is read off the vertical axis at that point.

Thus, for a first lot of 10 units, the plot point would be
3.75.

Fig. V-5--Plot points for average costs

*See, for example, PAMPER (Practical Application of Mid-Points
for Exponential Regression) Tables, Army Missile Command, Redstone
Arsenal, Alabama.

In  practice,  plot  points  for  only  the  first  two  or  three  lots,  or 
only  the  first  if  that  lot  comprises  more  than  about  25  units,  need  be 
taken  from  the  graph.  For  succeeding  lots,  the  arithmetic  lot  midpoint 
is  quite  adequate. 

The point here is not to recommend any particular technique, but
rather to underline that the problem of how best to plot the
representative unit costs for lots is important. Gross misplacement of early
points could lead to improper conclusions about the cost-quantity
relationships the curves are intended to represent.

VARIATIONS 

The examples used earlier for illustrative purposes tend to
suggest that data points generally fall along a straight line as one would
expect from the linear hypothesis. The sad truth is that plots of the
type illustrated in Fig. V-6 are not unusual and that fitting a curve to
these points is more than a matter of understanding the least-squares
method of curve fitting. The types of plots seen in Fig. V-6 are common
enough to have been given names in the airframe industry. The "scallop"
is generally caused by a model change or some other major interruption
in the production process. The characteristic of a scallop is that an
abrupt rise in manufacturing hours is followed by a rapid decline and
the basic slope of the curve is relatively unchanged.

When  a  model  change  is  sufficiently  great,  as  in  the  case  of  the 
change  to  the  F-106  from  the  F-102,  the  result  is  not  a  scallop  but 
a  change  to  a  new  curve.  In  this  case,  a  "leveling-off"  or  "follow-on" 
is characteristic of the initial portion of the new curve. This is
attributed  to  learning  from  a  previous  model  that  carries  over  and 
flattens  the  curve  during  initial  production.  This  can  also  occur  when 
production  is  halted  for  a  long  period  or  where  production  is  transferred 
to  a  new  facility. 

"Bottoming- out"  is  the  tendency  for  a  learning  curve  to  flatten 
at high production quantities. Intuitively, it seems reasonable that
at some point no further learning should occur or that whatever slight
learning does occur would be offset by the effect of small changes.
And empirically it can be established that bottoming-out has occurred
in a number of cases. There are those who argue, however, that
learning can continue indefinitely, or at least as long as the attempt is
made to obtain manhour reductions, and empirical evidence can be cited
to support this point of view. The classic case is probably that of
the operation involving the assembly of candy boxes where the learning
curve was found to have continued for the preceding 16 years during
which 16 million boxes were assembled by one person.* The problem for
the estimator, of course, is that while bottoming-out may occur in any
given case, it is difficult to predict where it will occur. One study
found that for the sample of airframes examined it was fairly typical
for some flattening to begin at the 300th unit,** but this has not been
true for many airframes in the past. The B-17 curve maintained about
a 70 percent slope out to the 6000th unit and then exhibited a toe-up.

Fig. V-6--Illustrative examples of learning curve slopes

"Toe-ups" and "toe-downs" are the names given to the rather sharp
rises or falls in hours that sometimes occur at the end of a production
series. The upward trend has been explained as resulting from the
transfer of experienced workers to other production lines, an increase
in the amount of handwork as machines are disassembled, failure to
replace or repair worn tooling at the normal rate, tool disassembly, or
from labor becoming less productive at the end of a program so as not
to work itself out of a job. Toe-downs are felt to be caused by
fewer engineering changes at the end of a production run and also by
the ability of the manufacturer to salvage certain types of items
fabricated in previous lots.

*Glen E. Ghormley, "The Learning Curve," Western Industry,
September 1952.

**Methods of Estimating Fixed-Wing Airframe Costs, Vol. I (Revised),
Planning Research Corporation, R-547A, April 1967.

***G. M. Brewer, The Learning Curve in the Airframe Industry, Air
Force Institute of Technology, Report SLSR-18-65, 1965.

While the names given to these particular variations are
unimportant, it is important to know that such variations occur--not
occasionally but frequently. In the analysis of manhour or cost data use of the
unit curve reveals these variations and is generally preferred for this
reason. The cumulative average curve tends to smooth out aberrations
to such an extent that even major changes can be obscured. Figure V-7
illustrates this. The data points are taken from a fighter aircraft
production program which had more than its share of problems. The
solid line shows how a cumulative average curve damps out the effect
of these problems. The choice between working with the unit or the
cumulative average curve depends upon the purpose at hand. The unit
curve better describes the data and is sometimes preferred for this
reason. On the other hand the cumulative average curve is widely
preferred in predictive models because of its computational simplicity,
i.e., the cost of N items is simply the cumulative average cost of the
Nth item times N. The important point is to understand both well
enough to be able to choose intelligently between them.

Fig. V-7--Smoothing effect of cumulative average curve
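
The smoothing effect is easy to see numerically. The sketch below builds a hypothetical unit-hour series: an 80 percent unit curve with an assumed first-unit value of 1,000 hours and an arbitrary 40 percent rise for units 51 through 60, imitating a scallop. It then compares the jump at unit 51 in the unit series with the jump in the cumulative average series. All numbers are invented for illustration and are not taken from Fig. V-7.

```python
import math

FIRST_UNIT_HOURS = 1000.0                  # hypothetical first-unit hours
B = math.log(0.80) / math.log(2)           # exponent of an assumed 80 percent unit curve


def unit_hours(n):
    """Hypothetical unit hours with a temporary 40 percent rise for units 51-60."""
    hours = FIRST_UNIT_HOURS * n ** B
    if n >= 51 and 60 >= n:
        hours *= 1.4
    return hours


units = list(range(1, 101))
unit_series = [unit_hours(n) for n in units]

# Cumulative average at unit n = total hours through n divided by n.
cum_series = []
running_total = 0.0
for n, hours in zip(units, unit_series):
    running_total += hours
    cum_series.append(running_total / n)

# Relative change from unit 50 to unit 51 in each series.
unit_jump = unit_series[50] / unit_series[49] - 1
cum_jump = cum_series[50] / cum_series[49] - 1
print(f"unit curve jump: {unit_jump:+.1%}, cumulative average jump: {cum_jump:+.1%}")

# Total cost of N items is the cumulative average at N times N.
n_items = 100
total_hours = cum_series[n_items - 1] * n_items
```

In this invented series the unit curve shows the disturbance plainly while the cumulative average barely moves, and the last two lines illustrate why the cumulative average form is convenient for totals.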

APPLICATIONS 

The learning curve is used for a variety of purposes and in a
variety of contexts, and how the curve is drawn will depend on the
purpose and the context. In long-range planning studies, for example, the
curve must be constructed on the basis of generalized historical data
and the possible error is considerable. Empirical evidence does not
support the concept of a single slope for all fighter aircraft, all
solid propellant missiles, all spacecraft, etc. The practice,
therefore, of assuming that manufacturing hours on the airframe will follow
an 80 percent curve (as was common for many years) or that electronic
equipment will follow, say, a 90 percent curve, can lead to very large
estimating errors.

In regard to airframes, Table V-3 shows the slope of the manufacturing
hour curves for 25 post-World War II Air Force and Navy aircraft and
indicates that a slope steeper than 80 percent is the rule. Since the
learning-curve slopes of Table V-3 show important differences it would be
desirable to relate slope to aircraft characteristics. In a sense a
technique suggested by Planning Research Corporation does this.*
Separate estimating equations based on aircraft characteristics are derived
for four different production quantities--10, 30, 100 and 300--and a
learning curve is developed from the estimates at these four points.

*Op. cit.

On a theoretical level, however, the concern is with those aircraft
characteristics which influence the rate of learning. In this regard
it seems reasonable to expect relatively little learning for a model
which represents a small modification over some preceding type since
the previous model would have already absorbed a considerable learning
effect. On the other hand, if an aircraft contains radically new design
features, one would expect a high initial cost followed by a rapid
decline with increased production quantities. In other words it has
been suggested that the "newness" of an aircraft should be a major
determinant of learning-curve slope, but explicit techniques for taking
newness into account have yet to be developed.


Table V-3

LEARNING CURVES FOR MANUFACTURING
(Labor--Airframe only)

Aircraft                Learning Curve Percentage

Fighter                           77
Fighter                           73
Fighter                           74
Fighter                           73
Fighter                           78
Fighter                           71
Fighter                           74
Fighter                           76
Fighter                           77
Fighter                           79
Fighter                           82
Fighter                           76
Fighter                           75
Fighter                           74
Bomber                            76
Bomber                            73
Bomber                            70
Bomber                            71
Bomber                            79
Cargo                             74
Cargo                             76
Cargo                             77
Cargo                             75
Trainer                           74
Trainer                           75

Mean                              75
Standard Deviation                2.7

G. S. Levenson and S. M. Barro, Cost-Estimating
Relationships for Aircraft Airframes, The
RAND Corporation, RM-4845-PR (Abridged), May 1966.
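
As a quick check on Table V-3, the summary figures can be recomputed from the 25 listed slopes. The sketch below uses Python's standard statistics module. Grouping the values in a dictionary is only a convenience for illustration, and the use of the sample (rather than population) standard deviation is an assumption; both round to the tabulated value.

```python
import statistics

# Learning-curve percentages transcribed from Table V-3.
SLOPES = {
    "Fighter": [77, 73, 74, 73, 78, 71, 74, 76, 77, 79, 82, 76, 75, 74],
    "Bomber": [76, 73, 70, 71, 79],
    "Cargo": [74, 76, 77, 75],
    "Trainer": [74, 75],
}

all_slopes = [s for group in SLOPES.values() for s in group]
mean_slope = statistics.mean(all_slopes)
std_slope = statistics.stdev(all_slopes)      # sample standard deviation
steeper_than_80 = sum(1 for s in all_slopes if 80 > s)

print(f"n = {len(all_slopes)}, mean = {mean_slope:.1f}, std dev = {std_slope:.1f}")
print(f"{steeper_than_80} of {len(all_slopes)} curves are steeper than 80 percent")
```

A spread of this size matters because, as discussed below, a difference of only two points in slope can change a total cost estimate substantially.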

For good estimating, then, learning curves must be established on
the basis of historical data relevant to the problem at hand. They are
equally applicable to missiles, electronic equipment, aircraft, ships,
and other types of equipment, but the slopes may be quite different for
each of these. (A recent study of avionics, for example, showed slopes
ranging from 84 percent to 91 percent with a median value of 88 percent.)
If a comparison is being made between two weapon systems, one involving
aircraft and the other missiles, the learning curve slope chosen for
each could play a significant part in the total system cost comparison.
In an appendix to this chapter the effect of slight variations in slope
is shown to be much greater than is generally recognized. To cite two
examples: The effect of using a 92 percent rather than a 90 percent
cumulative average curve is an increase of 25 percent in the total cost
of 1,500 items. As one would guess, the situation is much worse when
steeper slopes are involved. Assuming a slope of 62 percent instead
of 60 percent results in a 42 percent overstatement of the cost of
1,500 items and a 25 percent overstatement of the cost of 100 items.

As a practical matter, errors of this type can be minimized by
originating the curve at the estimated cost of the 100th unit rather than
the first. The table below shows how this reduces the effect of a two
percent change in slope on total cost.

                               Change in Total Cost of
Change in Slope                     1,500 Units

From 90% to 92%
  Curve originated at
    Unit 1                              25%
    Unit 100                             9%

From 60% to 62%
  Curve originated at
    Unit 1                              42%
    Unit 100                            14%
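
The figures in the table can be approximated from the cumulative average formulation developed in the appendix. In the sketch below, total cost is taken as T = aX^(1+b) with b = log S / log 2, and "originating the curve at unit N" is interpreted as holding the cumulative average cost at unit N fixed while the slope changes. That interpretation and the function name are assumptions made for illustration.

```python
import math


def total_cost_change(slope_old, slope_new, quantity, origin_unit=1):
    """Fractional change in total cost of `quantity` items when the slope changes.

    The cumulative average cost at `origin_unit` is held fixed, so the level of
    the curve cancels and only the ratio quantity / origin_unit matters.
    """
    b_old = math.log(slope_old) / math.log(2)
    b_new = math.log(slope_new) / math.log(2)
    return (quantity / origin_unit) ** (b_new - b_old) - 1


for old, new in ((0.90, 0.92), (0.60, 0.62)):
    for origin in (1, 100):
        change = total_cost_change(old, new, 1500, origin)
        print(f"{old:.0%} to {new:.0%}, origin at unit {origin:3d}: {change:+.1%}")
```

Under these assumptions the computed changes fall within about a percentage point of the tabulated values. The reduction comes from the shorter effective range over which the slope error compounds: a ratio of 15 rather than 1,500.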

Once a few data points are available either for developmental or
production items, the situation should be better, but, as illustrated
by Fig. V-8, the first few points may be misleading. Suppose an estimator
had been asked to estimate the cost of a large production contract after
the fabrication of the first 30 units. By fitting a curve to the
existing data he would have projected a learning curve with about an 88
percent slope and at a level considerably higher than that later
experienced. In such a situation it is important to realize that an
88 percent learning curve for airframe production is unlikely. In
effect, one should have some idea of what the answer should be and
differences should be investigated.

This  can  also  be  taken  as  an  example  of  the  small  sample  problem. 
Where  a  learning  curve  is  fitted  to  a  few  points,  the  correlation  may 
be  perfect,  that  is,  all  the  points  may  lie  on  the  fitted  line,  but  the 
results  can  still  be  unreliable.  The  points  used  in  fitting  must  be 
sufficiently  numerous  and  reasonably  homogeneous  with  the  points  implied 
by  extending  the  curve  to  offer  some  statistical  probability  of  success 
in  predicting  costs. 
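
A learning curve of the form y = ax^b is a straight line in logarithms, so a least-squares fit reduces to ordinary linear regression on log x and log y. The sketch below fits the four cumulative average plot points from the sample lot data earlier in this chapter. With so few points the fit illustrates the mechanics only, which is exactly the caution raised above. Function and variable names are illustrative.

```python
import math


def fit_learning_curve(points):
    """Least-squares fit of y = a * x**b in log-log space.

    Returns (a, b, slope), where slope = 2**b is the learning-curve fraction.
    """
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b, 2 ** b


# (cumulative unit, cumulative average hours) from the sample lots.
PLOT_POINTS = [(10, 583), (20, 510), (50, 415), (100, 355)]

a, b, slope = fit_learning_curve(PLOT_POINTS)
print(f"first-unit value a = {a:.0f} hours, exponent b = {b:.3f}, slope = {slope:.1%}")
```

The four points lie close to a line, yet nothing in the fit itself says whether the curve will hold beyond unit 100. That judgment has to come from comparable historical experience.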

The most important information an estimator faced with the above
problem could have would be a manufacturing history of the item
involved. Variations from the norm may be caused by particular problems,
configuration changes, or changes in manufacturing methods. In the
curve of Fig. V-8, the initially flat portion (out to the 30th airframe)
is explained by the manufacturer as being typical of the initial
production period. In this manufacturer's experience, the curve begins
to steepen when:

1.  Manpower  has  stabilized  or  reached  its  peak, 

2.  The  engineering  configuration  has  stabilized,  and 

3.  The  parts  flow  has  stabilized. 

Thus,  it  may  be  preferable  to  explain  some  points  and  exclude  them 
rather  than  to  include  them  and  bias  the  curve  in  height  or  slope. 

Whether or not to include all the points depends, in addition,
on the anticipated use of the resulting curve. If a unit cost curve
that includes all costs including changes is desired, a line of best
fit through the unit plot points may be appropriate. If the curve is
to be used in negotiating a follow-on contract, the effect of changes
should be eliminated by constructing a curve through the lower portion
of the plotted individual unit points as in Fig. V-9. In effect, this
assumes that the introduction of changes raises the hours initially
but that these decrease again to the level of the original curve.

*It is also possible to have a segmented unit curve as implied by
Fig. V-8 and some manufacturers subscribe to this concept.


Whatever  the  basic  technique,  it  is  important  to  remember  that  on 
logarithmic  grids  the  points  at  the  right  are  much  more  important  than 
those  at  the  left.  In  visually  fitting  a  line,  one  should  avoid  the 
tendency  to  be  unduly  influenced  by  plot  points  for  small  early  lots. 
Early units are often incomplete because they are used for test
purposes. Also, the early units are apt to include certain nonrecurring
problems incident to startup, and for this reason may be above the level
suggested by later plot points (CIR should help reduce this problem).


APPENDIX*

Assume a cumulative average cost-quantity curve of the form

$$A = aX^{b} \tag{1}$$

where a is the cost of the first item produced,

X is the number of items produced,

b is an exponent that measures slope,

A is the average cost of all items produced up to and including X.

In cost-quantity curve parlance, the rate of change of cost with
respect to X is referred to as the slope (S) of the curve instead of b.
S has special meaning in that it describes the average cost of 2X items
as a fraction of the average cost of X items. As $aX^{b}$ represents the
average cost of X items, $a(2X)^{b}$ must equal the average cost of 2X items.
Thus, given the above definition, the following relationship between
b and S must hold:

$$S = \frac{a(2X)^{b}}{aX^{b}} = 2^{b}$$

Using logarithms to solve for b results in

$$b = \frac{\log S}{\log 2}$$

Substitution of this expression for b in equation (1) results in

$$A = aX^{\frac{\log S}{\log 2}} \tag{2}$$

*This appendix is the work of R. L. Petruschell.

The cumulative average cost is but an input to the calculation of
the total cost of X items which is of particular interest. It is
therefore logical, for analytical purposes, to work with the total cost
equation itself which can be developed from the equation for the
cumulative average cost as follows: A, the average cost of X items, when
multiplied by X gives the total cost (T) of the same X items. This
follows from the fundamental idea of an average. Carrying out the
required manipulations in symbolic form results in the following
expression for T.

$$T = AX$$

and substituting equation (2) for A

$$T = aX^{\frac{\log S}{\log 2}} \cdot X$$

and simplifying

$$T = aX^{1 + \frac{\log S}{\log 2}} \tag{3}$$

At this point, observe that changes in the value of a are reflected
in T in relative fashion. If the value of a were to increase 10 percent,
the value of T would likewise increase 10 percent and furthermore do so
independently of the value of either X or S.

The effect of X and S on T is more complex. Rather than try to
display these effects by partial differentiation, etc., which is
possible, graphics are employed exclusively. Figure V-10 portrays the
solutions of equation (3) for values of S between .70 and 1.00, a equal
to 1, and X between 10 and 400, chosen to display the varying shapes of
the different curves.

It  appears  that  as  X  becomes  larger,  T  becomes  more  sensitive  to 
changes  in  S.  For  example,  a  shift  in  S  from  0.85  to  0.90  causes  a 
16  point  change  in  the  cost  of  100  items  and  a  65  point  change  in  the 
cost  of  400  items.  Also,  each  of  the  curves  levels  off  as  S  decreases 
leading  to  the  conclusion  that  the  sensitivity  of  total  cost  to  changes 
in  S  decreases  with  S. 
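
Equation (3) is simple to evaluate directly. The sketch below computes T = aX^(1 + log S / log 2) and checks the two observations made above: that T scales in proportion to a, and that the gap between the S = 0.85 and S = 0.90 curves widens as X grows. The function name and the quantity of 250 used in the proportionality check are illustrative assumptions.

```python
import math


def total_cost(a, quantity, slope):
    """Total cost of `quantity` items on a cumulative average curve, equation (3)."""
    return a * quantity ** (1 + math.log(slope) / math.log(2))


# A 10 percent increase in a raises T by 10 percent for any X or S.
base = total_cost(1.0, 250, 0.80)
scaled = total_cost(1.1, 250, 0.80)
print(f"ratio after raising a by 10 percent: {scaled / base:.2f}")

# Difference between the S = 0.90 and S = 0.85 curves at two quantities (a = 1).
gap_100 = total_cost(1.0, 100, 0.90) - total_cost(1.0, 100, 0.85)
gap_400 = total_cost(1.0, 400, 0.90) - total_cost(1.0, 400, 0.85)
print(f"gap at 100 items: {gap_100:.1f}, gap at 400 items: {gap_400:.1f}")
```

The computed gaps are close to the 16 and 65 points cited in the text; the small difference at 400 items is of the size one would expect if the text values were read from the chart.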


An examination of these sensitivities in relative terms provides
some additional insights, as is often the case when dimensions are
removed. Figure V-11, which is largely a simplified copy of Fig. V-10,
illustrates, in part, the calculation of an index ($T_R$) to measure the
variation in T with respect to X and S. A value of S designated $S_N$
and a corresponding value of T likewise designated are selected.
These values, as the subscript implies, are regarded as norms, or base
points around which variation is allowed (indicated by shift to S and
T). $T_R$, the index of relative change in T, is defined as the fractional
change in T resulting from an absolute change in S, or in equation form

$$T_R = \frac{T - T_N}{T_N} \tag{4}$$

or

$$T_R = \frac{T}{T_N} - 1$$

The following substitutions and simplifications result in the
expression that was actually used to evaluate $T_R$.

$$T_R = \frac{aX^{1 + \frac{\log S}{\log 2}}}{aX^{1 + \frac{\log S_N}{\log 2}}} - 1$$

$$T_R = X^{\frac{\log S - \log S_N}{\log 2}} - 1 \tag{5}$$

The fact that the a's cancel out indicates that the sensitivity of $T_R$
to S and X is independent of the value of a. Figure V-12 shows the
results of solving equation (5) assuming $S_N$ = .90, .86 ≤ S ≤ .94, and
1 ≤ X ≤ 1500. The vertical axis ($T_R$) indicates decimal fractions of
$T_N$ by which T differs. The origin, at the center, allows changes both
above and below $T_N$ to be indicated. The horizontal scale (S) is
similarly marked. Figures V-13, V-14, and V-15 present similar displays for
different values of $S_N$. The range of S, in each case, was restricted
to ± 4 units, thus permitting coverage of the relevant spectrum
without overlapping from figure to figure.

Fig. V-12--Values of T_R when S = 0.86-0.94

An  examination  of  Fig.  V-12  shows  that  the  relative  difference 
between  using  an  S  of  .90  and  an  S  of  .92  would  be  +  25  percent  in  the 
total  cost  of  1500  items.  Alternately,  if  an  S  of  .89  rather  than  an 
S  of  .91  had  been  used,  the  difference  relative  to  an  average  S  of  .90 
would  be  approximately  23  percent. 

Carrying out the same kind of exercise using Fig. V-15 results in
significantly greater differences. For example, assuming an S of .62
instead of .60 results in a 42 percent overstatement of the cost of
1500 items and a 25 percent overstatement in the cost of 100 items.
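
The index can also be evaluated without the charts. The sketch below assumes the form T_R = X^((log S - log S_N) / log 2) - 1, which follows from dividing equation (3) evaluated at S by equation (3) evaluated at the norm S_N. The function name is illustrative.

```python
import math


def relative_cost_index(slope, norm_slope, quantity):
    """T_R: fractional difference in total cost at `slope` relative to `norm_slope`."""
    exponent = (math.log(slope) - math.log(norm_slope)) / math.log(2)
    return quantity ** exponent - 1


# S = .92 against a norm of .90, 1500 items.
tr_92 = relative_cost_index(0.92, 0.90, 1500)

# S = .91 versus S = .89, both measured against a norm of .90.
spread_89_91 = (relative_cost_index(0.91, 0.90, 1500)
                - relative_cost_index(0.89, 0.90, 1500))

# S = .62 against a norm of .60, for 1500 and 100 items.
tr_62_1500 = relative_cost_index(0.62, 0.60, 1500)
tr_62_100 = relative_cost_index(0.62, 0.60, 100)

print(f"{tr_92:.3f} {spread_89_91:.3f} {tr_62_1500:.3f} {tr_62_100:.3f}")
```

Under this assumed form the four values land within a point or two of the 25, 23, 42, and 25 percent figures quoted above, a difference consistent with those figures having been read from the graphs.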

We must conclude that when using equations of this type to estimate
cost as a function of quantity, significant percentage variations in the
total cost can result from what are apparently much less significant
changes in S. In addition, the impact of a unit change in S on $T_R$ is
inversely proportional to the size of S.


Fig. V-14--Values of T_R when S = 0.66-0.74


VI. UNCERTAINTY

During the 1950's the difference between the original estimates
and the final cost of a number of weapon systems was so great that
in the latter part of that decade various agencies began looking at
case histories of the major equipment items involved in an attempt
to identify the reasons for the discrepancies. The problem is
illustrated by the table below (Table VI-1). Here for 16 aircraft and 6
missiles developed prior to 1958 the ratio of late estimate or actual
cost to early estimate has been computed and is shown as the factor
increase.


Table VI-1

FACTOR INCREASES OF THE PRODUCTION COST OF EQUIPMENT*

              Factor                     Factor
Equipment     Increase     Equipment     Increase

Fighter         3.9        Cargo           1.4
Fighter         2.6        Cargo           1.5
Fighter         2.0        Cargo           1.0
Fighter         1.5        Cargo           1.0
Fighter         1.7        Missile        14.7
Fighter         1.7        Missile         9.4
Fighter         1.0        Missile         4.4
Fighter         1.0        Missile         8.2
Fighter         1.1        Missile         1.5
Bomber          6.2        Missile         1.1
Bomber          2.8

1  i 

-  i  i  >  > 

_ i _ * _ L 


*Taken from A. W. Marshall and W. H. Meckling, Predictability of
the Costs, Time, and Success of Development, The RAND Corporation,
Paper P-1821, December 1959.

This table is of more than historical interest because factor
increases are still being experienced on some types of hardware,
particularly spacecraft, being procured by the government. For our
purpose, the main point of interest is the reason for these increases,
and, specifically, we would like to know if they are due to bad cost
estimating. If the problem is simply this, presumably the estimators
can be trained to do better and the situation can be improved. If, on the
other hand, the problem turns out to be poor management, bad design,
inadequate guidance, or something of that sort, the cost estimator can
do little except hope that the future will be better. A study of the
development histories of the equipment in the above table in an attempt
to answer such questions led to the following conclusions:

When early estimates are made of what it will cost to
produce or develop something new, the estimator typically
bases his estimate on the current design and the currently
planned program for development. If he is estimating cost
of production, he gets a total cost by costing the various
components as presently conceived and aggregating those.
If he is estimating the cost of development, he estimates
the cost of test articles, engineering man-hours, etc., as
presently planned and aggregates those. He does not specify
what performance he is associating with the particular design
nor does he indicate the date at which this performance is
to be operationally available. He is simply costing a
physical configuration and/or the physical resources contemplated
in the current development plan.

As development proceeds, however, these initial designs
and plans are almost invariably changed, either because of
unforeseen technical difficulties that forestall meeting
performance requirements, or because the customer decides
it is essential that the equipment be modified so as to keep
pace with changing predictions of enemy capabilities, new
operational concepts, and new technological possibilities.

... In principle it would be possible to factor into
two parts the total error in cost estimates as they are
prepared: (1) the part due to errors in the costing of the
configuration supplied to the cost estimator (i.e., the
intrinsic error in cost estimating) and (2) the part due to
changes in the configuration as development progresses. In
practice it has not been possible to carry out this
separation. However, it is our belief that the intrinsic errors
in costing a fixed configuration tend to be small relative
to the other source of error in the costing of most major
items of military equipment.*

*Marshall and Meckling, op. cit.

In other words, of the two kinds of errors mentioned above,
requirements uncertainty, i.e., variations in cost estimates stemming
from changes in the configuration being costed, is generally held to
be responsible for the major portion of factor increases. It should be
understood that this kind of uncertainty is not found in the United
States alone--the estimated cost of the joint British-French strike/
trainer aircraft, the Jaguar, had at the end of 1966 increased from
$1 million to more than $2 million because of changes in requirements and
the final cost was still uncertain. Nor is requirements uncertainty
confined to weapon systems--the House of Representatives' Rayburn
Office Building, originally expected to cost about $50 million,
exceeded $120 million when finished, largely because of design changes
after the original estimate was made. While it may be impossible to
eliminate discrepancies of this kind entirely, the Department of
Defense has attempted to minimize them by initiating the Contract
Definition Phase (CDP) for major defense contracts. A rigorous
definition of requirements prior to source selection should reduce the
importance of this kind of uncertainty in the future.

Cost-eatl>id ting  uncertainty  refers  to  variations  in  cost  estimates 
of  a  system  for  which  the  configuration  is  essentially  fixed  and  can 
arise  for  a  variety  of  reaaona: 

1.  Variations  in  cost  estimates  of  a  give.,  set  of  requirements 
can  occur  purely  because  of  differences  between  cost  analysis  in  in¬ 
terpreting  the  given  requirements,  in  methodological  approach  to  the 
problem,  in  specific  techniques  used,  and  so  on,  even  If  tha  analysts 
ere  of  comparable  competency. 

2.  Co*: -eat lasting  relationships  used  in  coat  analysis  cannot  be 

assumed  to  hold  exactly.  This  simply  mesna  thr'  as**  r  certain 

WwJe  *•  W  oaponsnt  as  a  function  of  some  variable  (or  variables),  we  usually 
cannot  assume  that  these  variable#  will  predict  the  particular  coat 
with  certainty. 

5.  Coat -as time  ting  arror  can  arise  f’om  tha  fact  that  data  used 
as  a  basis  for  coat  analysis  are  themselves  subject  to  error.  Putting 
it  another  way,  tha  observations  used  in  deriving  cost-estimating  rela¬ 
tionships  invariably  contain  arrors--eveu  if  Chase  data  come  f roc 
carefully  kept  historical  records. 
4. In costing advanced military systems, the cost analyst very
often uses cost-estimating relationships derived from past or current
experience. Here, one cannot be very confident that a "structural
relation" that holds reasonably well now will continue to hold
satisfactorily for the advanced system being costed. In fact, we
frequently of necessity have to extrapolate beyond the range of the
sample or data base from which the estimating relationship was derived.

5. Usually in making cost estimates for use in analyses where
comparative costs are of prime concern, the estimates are made in terms
of constant dollars, i.e., in terms of price levels prevailing in some
base year. Hence price level uncertainty is not a significant factor.
However, there are occasions when estimates for future systems may have
to be made in terms of price levels expected to prevail in future years.
Here there is obviously a potential source of error arising from the
possibility that future price levels may in fact turn out differently
than originally expected.

6. The price-level factor may cause difficulties of a different
nature. Sometimes, for example, the cost analyst may obtain data to
be used in cost analyses, and from the source it may not be clear
whether the data are in terms of constant or current price levels. A
case in point is contractor data--either historical or projected.
Very often contractor projections make provisions for possible wage
rate changes and/or material price changes. For such projections to be
useful for purposes of analysis, the analyst should be able to
determine the bases used for making these projected price changes.
Also, with respect to correcting historical data for price level
changes, some error is bound to arise because of the deficiencies
inherent in most price indexes.

The above listing is no doubt incomplete, but it does give an
indication of the main sources of cost-estimating uncertainty. In
the absence of a definitive empirical study, it is difficult to say
which of the sources are generally of greatest relative importance.

In an overall context, the following might be singled out:
     Errors in cost-estimating relationships
     Errors in data
     Extrapolation errors

PROPOSALS FOR TREATMENT OF UNCERTAINTY

Proposals for treatment of uncertainty in cost analysis range from
conventional statistical tools to the use of what are commonly
known as "fudge" factors. The latter we rule out--not because of a
high sense of morality that says use of fudge factors is wrong but on
pragmatic grounds. To multiply a carefully worked out cost estimate
by some factor because, on the average, estimates of a certain type of
hardware have been low by that amount may or may not improve the quality
of the estimate. For example, use of an average factor for the cases
of Table VI-1 would have the following results:

             Number of Estimates    Number of Estimates
                  Improved               Degraded

Fighters             5                      4
Bombers              2                      1
Cargo                2                      2
Missiles             4                      2
                    --                     --
Total               13                      9

To improve the quality of some estimates it is necessary to degrade
that of others. Hence in any particular case the cost analyst cannot
know in advance whether use of a factor will be beneficial or harmful.
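
The point about average factors can be checked with a few lines of
arithmetic. The sketch below is only an illustration: the ratios of
actual to estimated cost are made-up values, not the cases of Table
VI-1, and the function name is hypothetical. Each estimate is scaled
by the average ratio and we count how many move closer to actual cost
and how many move farther away.

```python
from statistics import mean

# Hypothetical ratios of actual cost to estimated cost for five past
# programs of one hardware type. Illustrative values only; these are
# not the cases of Table VI-1.
RATIOS = [1.1, 1.3, 1.6, 2.0, 2.5]


def effect_of_average_factor(ratios):
    """Return (improved, degraded) counts after scaling every estimate
    by the average actual-to-estimate ratio."""
    factor = mean(ratios)
    improved = degraded = 0
    for ratio in ratios:
        # Express each original estimate as 1.0, so that the actual
        # cost is numerically equal to the ratio.
        error_before = abs(ratio - 1.0)
        error_after = abs(ratio - factor)
        if error_after < error_before:
            improved += 1
        elif error_after > error_before:
            degraded += 1
    return improved, degraded


improved, degraded = effect_of_average_factor(RATIOS)
print(f"improved: {improved}, degraded: {degraded}")
```

With these made-up ratios three estimates improve and two are
degraded, and nothing known in advance tells the analyst into which
group a new estimate will fall.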

Conventional statistical tools are of only limited value in coping
with the problem of uncertainty in cost analysis because the occasions
on which they can be used rigorously are quite rare. First of all,
to derive the conventional statistical measures of uncertainty, e.g.,
confidence intervals, prediction intervals, and the like, one must
draw a representative sample from a designated population to be used
as a basis for the statistical analysis. In cost analysis of advanced
hardware, by the nature of the case, we usually do not have such a
population from which to draw representative samples. (In fact we
sometimes deal with the entire universe.)

Even where samples of a sort can be drawn, the size of the sample
is invariably very small--two or three observations, five or six if we
are lucky. Sample sizes this small strain the applicability of most
statistical theory to the limit--even small sample theory.
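
To see what samples of this size imply, consider the width of a
prediction interval. The sketch below assumes, for illustration only,
a random sample from a normal population, which is exactly the kind
of assumption questioned here; the sample costs are hypothetical. It
shows how quickly the interval widens as the sample shrinks from six
observations to two.

```python
import math
from statistics import mean, stdev

# Two-sided 95 percent critical values of Student's t, keyed by degrees
# of freedom (standard table values).
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571}


def half_width_multiplier(n):
    """Multiple of the sample standard deviation that gives the
    half-width of a 95 percent prediction interval for one new
    observation, given a sample of size n (2 through 6)."""
    return T_95[n - 1] * math.sqrt(1 + 1 / n)


def prediction_interval(sample):
    """95 percent prediction interval for one further observation,
    assuming a random sample from a normal population."""
    half = half_width_multiplier(len(sample)) * stdev(sample)
    centre = mean(sample)
    return centre - half, centre + half


for n in range(2, 7):
    print(f"n = {n}: interval is mean +/- {half_width_multiplier(n):.2f} s")

# Hypothetical unit costs (millions of dollars) of three past systems.
low, high = prediction_interval([9.0, 11.0, 10.0])
print(f"prediction interval: {low:.2f} to {high:.2f}")
```

Even if every assumption held, an interval this wide would often be of
little use to a planner.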

In the rare instances where the objections above can be reasonably
overcome, we may still have problems because of difficulty in justifying
the assumptions of the statistical model in our particular application.
For example, the model may require specification of the form of the
distribution function in the population from which the sample is drawn.
We are usually not in a position to make such a designation--for example,
to make the assumption of normality. The normality assumption would
not be so serious if the sample size were large. But as indicated
above, in our work exceedingly small sample size is the rule rather
than the exception. One possibility for dealing with this problem in
the future is to use non-parametric or distribution-free methods of
estimation.* While these methods are still relatively new and the
theory not fully developed, the possible usefulness of distribution-
free methods in the future should not be overlooked.

In addition to this problem, other technical difficulties are apt
to arise. Consider the case of a regression model using the "errors
in the equation" approach--i.e., that the estimating equation holds
subject to a random disturbance (u), but that the variables contain no
error or at least errors of relatively minor significance. A usual
specification on u is that successive values of this variable are
mutually independent (non-autocorrelated) and that u is independent of
the explanatory variables. This assumption may be somewhat difficult
to justify in certain applications; however, in doubtful cases, the
non-autocorrelation can be subjected to statistical test.**
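
One such test uses the ratio of the mean square successive difference
to the variance, the statistic tabulated in the Hart and von Neumann
reference. The sketch below computes that ratio for a series of
regression residuals. The function name and the choice of divisors
(n - 1 for the successive differences, n for the variance) are
assumptions made for this illustration, and judging significance
still requires the published tables.

```python
def von_neumann_ratio(residuals):
    """Ratio of the mean square successive difference to the variance
    for a sequence of residuals taken in their natural order."""
    n = len(residuals)
    if n < 2:
        raise ValueError("at least two residuals are required")
    mean_u = sum(residuals) / n
    variance = sum((u - mean_u) ** 2 for u in residuals) / n
    if variance == 0:
        raise ValueError("residuals are all equal; ratio is undefined")
    successive = zip(residuals, residuals[1:])
    mssd = sum((later - earlier) ** 2 for earlier, later in successive) / (n - 1)
    return mssd / variance


# Residuals that drift steadily give a ratio well below 2, which
# suggests positive autocorrelation.
print(von_neumann_ratio([1.0, 2.0, 3.0, 4.0]))

# Residuals that alternate in sign give a ratio well above 2, which
# suggests negative autocorrelation.
print(von_neumann_ratio([1.0, -1.0, 1.0, -1.0]))
```

For independent disturbances the ratio should fall near 2, so values
far from 2 in either direction cast doubt on the independence
assumption.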

*Distribution-free methods do not require an assumption about a
specific form of probability distribution function. E.g., see A. M.
Mood, Introduction to the Theory of Statistics, New York, McGraw-Hill
Book Co., Inc., 1950, Chap. 16.

**E.g., see Lawrence R. Klein, Econometrics, Evanston, Illinois,
Row, Peterson & Co., 1953, pp. 89-90. Also, see B. I. Hart and J. von
Neumann, "Tabulation of the Probabilities for the Ratio of the Mean
Square Successive Difference to the Variance," Annals of Mathematical
Statistics, XIII (1942), pp. 207-214.
Finally, where conventional statistical methods are applicable,
they are more likely to be so in the treatment of cost-estimating
uncertainty than of requirements uncertainty. Yet, as was stated
earlier, requirements uncertainty has been the principal source of error.

Many of the technical statistical points raised above may not always
be of major significance in practical applications. Nevertheless we
must be aware of these matters in dealing with cost-estimating
uncertainty so that our interpretations of standard errors, prediction
intervals, and the like may be kept in proper context. Finally, it
should be pointed out that even in cases where such statistical measures
are not subject to rigorous interpretation, they may still be of
considerable help in forming subjective judgments about the reliability
of statistically derived estimating relationships.

Cost-Sensitivity  Analysis 

One basis for approaching the problem is an acceptance of the idea
that a certain amount of uncertainty is inevitable in any action
occurring in the future. Having admitted this, it is possible to look
at a proposed weapon or support system, single out the areas of greatest
uncertainty and assign some limits to them. This process is sometimes
known as cost-sensitivity analysis. It appears promising because it
highlights the uncertainty inherent in future system costs and gives
the planner a full view of the cost implications of decisions affecting
system configuration and operations.

This type of analysis is primarily useful in the long-range planning
phase of a program where system parameters are still tentative. It is
also of more value, perhaps, in looking at total system cost than at
hardware cost only. As an example, consider making an estimate of the
total system cost of an aerospace plane--a manned aircraft that can take
off from a runway, fly into orbit, and after completing its mission there,
fly back to earth and land. Among the characteristics that are unknown
are: the size of the vehicle, the number of flights it could make per
year, the missions it would be used for, and the attrition and wear-out
rates. The unknowns far outnumber the knowns, if indeed there are any
knowns in a system as far-out as this one. But let us assume it would
be desirable for planning purposes to compare the cost of using an
aerospace plane for performing a number of missions with the cost of
using several suitable boosters. It is possible, using a range-of-values
approach, to come up with a range of cost estimates. Using this
approach means looking at a range of vehicle weights, a range of
utilization rates, and so on. Thus we would have a series of
displays like those sketched below. Further, an analysis of these will
indicate the particular system characteristics to which total system
cost is sensitive and those to which it is insensitive. In our aerospace
plane example, we might find that for a range of vehicle weights,
utilization rates, missions, and attrition and wear-out rates, the range
of total system cost is so great as to be meaningless. Closer scrutiny
might reveal, however, that a major part of this variation comes from a
single system characteristic, say, mission altitude. By limiting the
system to low-orbit missions, the variation in cost might be reduced to a
range small enough to make meaningful comparisons with other systems
possible.


[Sketches of cost against mission altitude and against vehicle weight]
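
A range-of-values analysis of this kind is easy to mechanize. The
sketch below is a minimal illustration only: the cost model, its
coefficients, the baseline, and the ranges are all hypothetical and
are not taken from any aerospace plane study. Each characteristic is
varied over its range while the others are held at baseline, and the
resulting spread in total system cost is reported.

```python
def total_system_cost(weight_lb, flights_per_year, altitude_nmi):
    """Hypothetical total-system-cost model, in millions of dollars."""
    vehicle = 0.001 * weight_lb
    operations = 1.0 * flights_per_year
    altitude = 10.0 * (altitude_nmi / 100.0) ** 2
    return vehicle + operations + altitude


# Hypothetical baseline system and ranges of values to be examined.
BASELINE = {"weight_lb": 100_000, "flights_per_year": 20, "altitude_nmi": 100}
RANGES = {
    "weight_lb": [80_000, 100_000, 120_000],
    "flights_per_year": [10, 20, 30],
    "altitude_nmi": [100, 300, 1000],
}


def cost_ranges(model, baseline, ranges):
    """Vary one characteristic at a time over its range, holding the
    others at baseline, and return {name: (lowest cost, highest cost)}."""
    result = {}
    for name, values in ranges.items():
        costs = [model(**{**baseline, name: value}) for value in values]
        result[name] = (min(costs), max(costs))
    return result


for name, (low, high) in cost_ranges(total_system_cost, BASELINE, RANGES).items():
    print(f"{name:18s} {low:8.1f} to {high:8.1f}  (spread {high - low:.1f})")
```

In this made-up case nearly all of the variation comes from mission
altitude, so restricting the altitude range would narrow the total
cost range far more than better knowledge of weight or utilization.
Varying one characteristic at a time does not show the joint range;
for that, combinations of values would have to be costed as well.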


This example is concerned with requirements uncertainty in a total
system context. If we are interested only in the cost of the aerospace
plane itself, a similar analysis could be performed to establish the
cost implications of changes in weight, speed, payload in orbit, type
of propulsion, etc. Or, if interest centers on cost-estimating
uncertainty, one could examine a range of material or fabrication costs
as in the following example where new technology makes estimating more
uncertain than usual.

The aircraft industry is continually searching for new materials
that will be stronger, lighter, have a higher heat resistance, or offer
some other advantage over materials now used. At present, boron-fiber
reinforced composite appears to offer potential weight savings as a
replacement for some portion of the aluminum commonly used, but, still
being in an experimental stage, it is very expensive--about $700 per
pound. Fabrication techniques are just being worked out now, and they
are also very expensive. At some time in the future, however, boron
material may become available in quantity for use in aircraft
production, and it is always of interest to examine the possible effect
of a new material on cost (consider, for example, the speculation about
the cost of using titanium in the F-111 and supersonic transport).

To examine the effect substitution of boron materials would have
on the production cost of aircraft, a range-of-values approach provides
more information than a single-value estimate as well as emphasizing
the uncertainty of the numbers. In this example, then, in which
manufacturing costs only are considered, a range of costs is stipulated
wherever appropriate. Manufacturing costs are largely a function of
weight and for a large modern fighter aircraft are estimated to run
about $60 per pound (at the 600th unit). Considerable uncertainty
exists about the cost of fabricating sheets and panels of boron, even
assuming that computer-controlled machines will be available. To allow
for this uncertainty we postulate a range of fabrication costs, from
$72/lb to $121/lb based on optimistic and pessimistic predictions of
persons having some experience with fabrication of boron composite.

The material cost is comprised of aluminum, purchased parts and
equipment, and boron composite. These can also be estimated on a cost-
per-pound basis, and for aluminum the cost should be about what it is
today--$10/lb--with no variation considered. For purchased parts and
equipment there is some uncertainty about what would go on the boron
airplane, so a range of $60/lb to $100/lb is chosen (compared with
$60/lb for an aluminum aircraft). While boron costs are still in the
realm of conjecture, Fig. VI-1 shows a projection of how they might
decrease over time. For this example, we have taken the cost at three
different times--$125/lb in 1968, $50/lb in 1974, and $25/lb in 1980--
with the expectation that the real range of interest is comprised of
the final two. The 1968 figure is included as a reminder of current
reality. The manufacturing and material costs (in millions of dollars)
resulting from these cost factors are shown below:


                                 Boron Cost

                    $125/lb          $50/lb           $25/lb

                  High    Low      High    Low      High    Low

Manufacturing     2.00    1.45     2.00    1.45     2.00    1.45
Material          3.32    3.09     1.01     .78      .80     .57

Total             5.32    4.54     3.01    2.23     2.80    2.02

These figures show a possible range of $5.32 million to $2.02
million, and a likely range of $3.01 million to $2.02 million. They also
show that total manufacturing cost is relatively insensitive to changes
in the cost of boron once this cost has declined to the $50/lb level.

The procedure illustrated above is applicable to any situation in
which costs and/or requirements are uncertain and limits can be assigned
to the uncertainty with some assurance. The major drawback to cost-
sensitivity analysis is implied by this latter condition, since there
is no guarantee that in any given analysis all the relevant alternatives
will be included. Regardless of its limitations, cost-sensitivity
analysis is probably one of the best currently available techniques
for helping deal with the uncertainty problem in estimating the cost
of equipment and weapon systems.

Monte Carlo Techniques

One method proposed for dealing with uncertainty begins with the
assumption that a cost analyst can describe each input parameter with
a probability distribution.* This distribution is then treated as a
theoretical population from which random samples are obtained. The
methods of taking such samples, as well as problems which rely on
these sampling techniques, are often referred to as Monte Carlo methods.

To illustrate the Monte Carlo procedure for simulating cost input
uncertainty, consider the example depicted in Fig. VI-2.

*This method is described in more detail in a report by P. F.
Dienemann, Estimating Cost Uncertainty Using Monte Carlo Techniques,
The RAND Corporation, RM-4854-PR, January 1966.
Fig. VI-2--Monte Carlo sampling


From the probability density, Y = f(x), describing the actual (or
estimated) input uncertainty, a cumulative distribution is plotted.
Next, a random decimal between zero and one is selected from a table
of random digits. By projecting horizontally from the point on the
Y-axis corresponding to the random decimal to the cumulative curve, we
find the value of x corresponding to the point of intersection. This
value is taken as a sample value of x.

The result, if this procedure is repeated numerous times, is a
sample of input values that approximates the required input uncertainty.
As seen in Fig. VI-3, the more repetitions, the better the simulated
input distribution.


Fig. VI-3--Simulated input distribution
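
The graphical procedure just described amounts to inverting the
cumulative distribution. The sketch below is one way it might be
mechanized, with a pseudo-random number generator standing in for
the table of random digits. The function names and the tabulated
density are hypothetical, and the cumulative curve is interpolated
linearly between tabulated points, so the result is approximate in
the same way a hand-drawn curve is.

```python
import bisect
import random


def make_sampler(xs, densities):
    """Build a sampler from a density tabulated at increasing points xs.

    The cumulative curve is accumulated by the trapezoid rule, scaled
    to run from zero to one, and inverted by linear interpolation."""
    cumulative = [0.0]
    for i in range(1, len(xs)):
        area = 0.5 * (densities[i] + densities[i - 1]) * (xs[i] - xs[i - 1])
        cumulative.append(cumulative[-1] + area)
    total = cumulative[-1]
    cumulative = [c / total for c in cumulative]

    def sample(u):
        """Return the x at which the cumulative curve reaches height u,
        for a random decimal u between zero and one inclusive."""
        j = bisect.bisect_left(cumulative, u)
        if j == 0:
            return xs[0]
        fraction = (u - cumulative[j - 1]) / (cumulative[j] - cumulative[j - 1])
        return xs[j - 1] + fraction * (xs[j] - xs[j - 1])

    return sample


# Hypothetical density for a cost per pound: zero at 300 and 650, with
# a peak at 400. Only relative heights matter; the sampler rescales.
sample = make_sampler([300.0, 400.0, 650.0], [0.0, 1.0, 0.0])

rng = random.Random(1)          # fixed seed so the run is repeatable
draws = [sample(rng.random()) for _ in range(1000)]
print(min(draws), max(draws))
```

A histogram of the draws approaches the tabulated density as the
number of repetitions grows.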


The procedure for estimating cost uncertainty follows readily
once simulations of input values have been made. To illustrate, consider
the following simple estimating relationship:

          C = W x P

where C = cost,
      W = weight,
      P = cost per pound.

Assume the actual uncertainty of the input parameters can be represented
with probability distributions as shown in Fig. VI-4, with L, M, and H
denoting the lowest possible, most-likely, and highest possible values,
respectively. Furthermore, assume that these values are as follows:


Fig. VI-4--Input uncertainty distributions


From the input distributions, a sample value for both the weight
and the cost per pound is generated by means of the Monte Carlo
technique. Using these two sample values, a cost is calculated. The
procedure is repeated again and again until the nature of the output
uncertainty has been established. Table VI-2 summarizes the procedure
for 1000 iterations.
                        Table VI-2

          MONTE CARLO SIMULATION OF COST UNCERTAINTY

   Iteration         W           P            C

       1             83         405         33,615
       2            108         633         68,364
       3            103         374         38,522
       4            101         422         43,652
       5             92         387         35,604
       .              .           .            .
       .              .           .            .
     1,000            .           .            .

   Mean Values      100         450         45,000

From the set of cost estimates, a frequency distribution as shown
in Fig. VI-5 can be prepared to portray the cost uncertainty. It is
interesting to note that the mean value of the cost is higher than the
single-value cost estimate ($40,000)--the product of the most-likely
values for each input factor. The difference between the two estimates
occurs because the uncertainty about the cost per pound is skewed to
the right. If the uncertainty distributions of both input factors
were symmetric, the two cost estimates would be identical.
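
The whole procedure can be sketched for a computer in a few lines.
The L, M, and H values and the triangular shape of the distributions
used here are assumptions chosen only to be consistent with the
most-likely product of $40,000 and the mean values reported in Table
VI-2; they are not the report's own inputs. The function name is
likewise hypothetical.

```python
import random
from statistics import mean

# Hypothetical (L, M, H) values: lowest possible, most likely, highest
# possible. Weight is symmetric; cost per pound is skewed to the right.
WEIGHT = (80.0, 100.0, 120.0)
COST_PER_POUND = (300.0, 400.0, 650.0)


def simulate_costs(weight_lmh, cost_lmh, iterations=1000, seed=0):
    """Draw independent triangular samples of W and P and return the
    list of costs C = W x P, one per iteration."""
    rng = random.Random(seed)
    costs = []
    for _ in range(iterations):
        # random.triangular takes (low, high, mode) in that order.
        w = rng.triangular(weight_lmh[0], weight_lmh[2], weight_lmh[1])
        p = rng.triangular(cost_lmh[0], cost_lmh[2], cost_lmh[1])
        costs.append(w * p)
    return costs


costs = simulate_costs(WEIGHT, COST_PER_POUND, iterations=20000)
single_value = WEIGHT[1] * COST_PER_POUND[1]   # product of most-likely values

print(f"single-value estimate: {single_value:,.0f}")
print(f"mean of simulated costs: {mean(costs):,.0f}")
```

Under these assumed inputs the simulated mean settles near $45,000,
above the single-value estimate, because the mean of the right-skewed
cost per pound exceeds its most-likely value.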

Although this example depicts a very simple costing problem, the
techniques are applicable to more realistic situations. However, when
the scope of the problem is expanded it is expedient that the costing
model be programmed for a computer.

It must be noted that using the Monte Carlo technique to estimate
cost uncertainty in this manner requires that all input parameters be
mutually independent. With cost factor inputs, we can probably conclude
that the assumption of independence is true. However, with system
requirements we must be more careful. In cases where a functional
relationship does exist between two or more inputs, we can often
circumvent the interdependence problem by incorporating the relationship
within the cost model; or if the problem demands, one could explore more
sophisticated techniques for sampling from joint frequency
distributions.*

Cost Estimate Confidence Rating

In an entirely different approach to the problem the Air Force
Systems Command has instituted a Cost Estimate Confidence Rating (CECR),
AFSC Form 27, which attempts to establish subjective limits on the
confidence to be placed in each separate element of an estimate, e.g.,
airframe, propulsion, etc.** In this procedure the estimator is asked
to assign a value of from 1 to 5 to each of the following factors:

Estimating Conditions
     Estimating time and information access
     Ground rules and assumptions
     Other (specify)

Nature of the Item
     State of the art
     Production experience
     Other (specify)

Item Description
     Specification status
     Operating program characteristics

Cost Methods and Data
     Methods
     Data

A rating of 1 on Estimating Time and Information Access, for
example, means "there was complete access to available data needed to cost

*D. J. Finney, "Frequency Distribution of Deviates from Means and
Regression Lines in Samples from a Multivariate Normal Population,"
The Annals of Mathematical Statistics, Vol. 17, 1946.

**Described in AFSCL 173-1A, Attachments 3-11 through 3-14.
this item and there was ample time to thoroughly research these sources."
A rating of 5, on the other hand, implies that "the dominating source of
uncertainty has been the completely inadequate amount of time provided
to make the estimate and/or the complete lack of access to useful data
sources." From the ratings assigned to each factor a consolidated
confidence rating is determined (normally the arithmetic mean of the
ratings assigned to the individual factors) which expresses the
estimator's overall confidence. In addition to the ratings AFSC Form 27
calls for an estimate of the most likely cost, lower-bound cost, and
upper-bound cost. These upper and lower bounds presumably stem from the
uncertainties previously specified. A sample form is shown in Fig. VI-6.
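
The consolidation step is simple arithmetic, and a short sketch may
make it concrete. The ratings shown are hypothetical values for a
single element of an estimate (say, the airframe); the function name
and the dictionary layout are assumptions made for this illustration
and are not part of AFSC Form 27.

```python
from statistics import mean


def consolidated_rating(ratings):
    """Arithmetic mean of the individual factor ratings, where 1 is the
    most favorable rating and 5 the least favorable."""
    for factor, value in ratings.items():
        if not 1 <= value <= 5:
            raise ValueError(f"rating for {factor!r} must be from 1 to 5")
    return mean(list(ratings.values()))


# Hypothetical ratings for one element of an estimate.
airframe_ratings = {
    "Estimating time and information access": 2,
    "Ground rules and assumptions": 1,
    "State of the art": 4,
    "Production experience": 5,
    "Specification status": 3,
    "Operating program characteristics": 3,
    "Methods": 2,
    "Data": 4,
}

print(consolidated_rating(airframe_ratings))
```

A mean of this kind hides which factors drive the uncertainty, so the
individual ratings remain worth reporting alongside it.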

While from an operational point of view it is not clear what the
recipient of an estimate does when he is told to give the estimate
little credence, documentation of the sources and extent of uncertainty
in an estimate should be helpful. Also, the need to specify which
estimates he is most uncertain about and why may spur the estimator to do
a better job on these items. Thus, while the AFSC CECR is still
experimental and cannot be evaluated empirically as yet, it does
represent a constructive step in the right direction.

Better  Information 

One better solution is sometimes feasible, given the same
condition necessary to use cost-sensitivity and Monte Carlo techniques,
i.e., that the areas of uncertainty can be defined. This solution is
to reduce or eliminate uncertainty by obtaining better knowledge, which
in effect is the purpose of the Contract Definition Phase of hardware
procurement. A careful spelling out of requirements and design
specifications can eliminate much of the uncertainty that pervades a
conceptual study. Or actual tests may be performed to obtain more
knowledge, as in the case of the supersonic transport where both Boeing
and Lockheed fabricated a number of parts out of titanium to gain
information on the cost of working with this metal. In that situation,
the need to reduce cost-estimating uncertainty impelled both companies
to spend several
millions of dollars. The government cost estimator may never have the
resources for a similarly massive attack on his own problems, but the
example is instructive nonetheless. Uncertainty can be reduced in some
instances by experimentation, in others by better definition, and in
all by increased knowledge. Nevertheless, the cautionary note sounded
by the World Meteorological Organization in 1966 on the subject of
weather forecasting is probably applicable here:

     The basic characteristics of uncertainty will almost
     surely continue to be operationally significant in the
     foreseeable future.


Fig. VI-6--Cost estimate confidence rating


